
Changes in v1.2.6

Just a quick note about the changes in v1.2.6, released 08/28/21.

  • npz files now use compression, so they will be smaller

  • doodler by default now uses threading, so it should be faster

    You can disable this by editing the environment/settings.py and setting THREADING to False

  • new variable 'orig_image' written to npz, as well as 'image'

    They are identical except on occasions where converting imagery to floats creates a 4th band, to accommodate a nodata band in the input. In that case 'orig_image' and 'image' have different uses for segmentation and are therefore both preserved. 'orig_image' is now the one written to npz files for Zoo (see the sketch after this list).

  • new function gen_images_and_labels_4_zoo.py, which creates greyscale jpegs and 3-band color images (orig images) and writes them to the results folder

  • updated the other utility functions to accommodate the new variable orig_images, which now takes priority over the variable images

  • website and README updated with details about all of the utility scripts
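To see what this looks like on disk, here is a minimal sketch (assuming numpy, and an npz file written by v1.2.6; the file name is hypothetical) that loads an output file and compares the two image arrays:

import numpy as np

# open a compressed npz written by Doodler v1.2.6 (file name is hypothetical)
with np.load('my_doodler_output.npz') as data:
    print(data.files)                 # e.g. ['image', 'orig_image', 'label', ...]
    image = data['image']
    orig_image = data['orig_image']

# the two arrays are identical unless float conversion added a 4th (nodata) band
print(image.shape, orig_image.shape)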

What's new in version 1.2.1? (May 10, 2021)

Existing users of Doodler should be aware of the major changes in version 1.2.1, posted May 10. There are a lot of them, and they mean you will use Doodler a little differently going forward. I hope you like the changes!

First, Doodler has a new documentation website (you know this because you are here).

There are so many new features in this release that they are organized by theme:

GUI:

  • Model independence factor replaces the previous "color class tolerance" parameter (the mu parameter in the CRF). Higher numbers give the model greater ability to 'undo' class assignments from the RF model. Typically, you want to trust the RF outputs, so keep this number small.
  • Blur factor replaces the previous "Blurring parameter" (the theta parameter in the CRF). Larger values mean more smoothing.
  • The CRF controls are, in order: Blur factor, Model independence factor, downsample, and probability of doodle. This is the order of likelihood that you will need to tweak them.
  • There is no longer an option to apply a median filter. Its usage was generally disappointing/problematic, and it has been replaced with an alternative method (see below)
  • The 'show/compute segmentation' button is now blue, so it stands out a little better
  • On the RF controls, there is no longer an option to change the 'sigma range'. SIGMA_MAX is now hard-coded at 16, and SIGMA_MIN at 1. Tests reveal insensitivity to these parameters, and keeping them as options presented problems for other areas of the workflow; by enforcing the same range of sigma scales, the RF always uses the same number (75) of output feature maps

I/O:

  • JPEG files with all extensions (.jpg, .jpeg, or .JPG) are now usable inputs
  • greyscale images and annotations are no longer saved to png files; instead they are written to a compressed numpy archive (npz), which encodes
    • 'image' = image
    • 'label' = one-hot-encoded label array
    • 'color_doodles' = color (3D) doodles
    • 'doodles' = greyscale (2D) doodles
  • the npz file is overwritten, but old arrays are kept, prefixed with '0' and prepended with another '0' on each subsequent overwrite, so the more '0's, the older the array; the names without '0's are always the newest. Color images are still produced with time tags. (See the sketch after this list.)
  • DEFAULT_CRF_DOWNSAMPLE = 4 by default
  • When using predict_folder.py, the user decides between two modes, saving either the default basic outputs (the final output label) or the full stack of outputs for debugging or optimizing
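As an illustration, here is a minimal sketch (assuming numpy; the file name is hypothetical, and the class dimension of 'label' is assumed to be last) that lists the arrays stored in one of these npz files and flattens the one-hot-encoded label:

import numpy as np

# open a compressed npz written by Doodler (file name is hypothetical)
with np.load('sample_doodler_output.npz') as data:
    print(data.files)            # names with more leading '0's are older versions
    label = data['label']        # one-hot-encoded label stack, assumed H x W x classes
    doodles = data['doodles']    # greyscale (2D) doodles

# collapse the one-hot stack into a 2D image of class indices
flat_label = np.argmax(label, axis=-1)
print(flat_label.shape, np.unique(flat_label))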

Modeling:

  • All input images are now standardized and rescaled to [0,1]. This results in better portability of RF models, and is generally good practice for dealing with large outlier values.
  • The CRF label is now resized with no antialiasing, and is inpainted at transition areas between classes
  • Decreased maximum number of allowable samples in RF model to 200,000
  • Small holes and islands in the one-hot encoded RF and CRF masks are now removed (see the sketch after this list). The threshold area in pixels is 2*W, where W is the width of the image in pixels.
  • Median filtering is now removed. It is no longer needed, and it created problems and extra buttons/complexity. Instead, Doodler now ...
  • Implements 'one-hot encoded mask spatial filtering'
  • Implements inpainting on regions spatially filtered
  • Pen width is used as-is; it is no longer exponentially scaled
  • SIGMA_MAX=16 and SIGMA_MIN=1 are hardcoded. This makes it easier to manage the number of features, which now must be 75. Also, these parameters make very little difference
  • in predict_folder, extracted features are memory mapped to save RAM
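The spatial filtering of small holes and islands can be approximated with scikit-image, which is already in the Doodler environment. Here is a minimal sketch (the toy mask and the exact morphology calls are illustrative, not Doodler's internal code):

import numpy as np
from skimage import morphology

# toy binary mask standing in for one class of a one-hot encoded label stack
mask = np.zeros((400, 600), dtype=bool)
mask[100:300, 150:450] = True      # one large region
mask[10:12, 10:12] = True          # a small 'island'
mask[200:202, 300:302] = False     # a small 'hole'

# threshold area of 2*W pixels, where W is the image width
min_area = 2 * mask.shape[1]

# remove islands smaller than the threshold, then fill comparably small holes
filtered = morphology.remove_small_objects(mask, min_size=min_area)
filtered = morphology.remove_small_holes(filtered, area_threshold=min_area)
print(mask.sum(), filtered.sum())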

Other:

  • RF feature extraction is now carried out in parallel, and CRF 'test time augmentation' is also parallelized. This should speed things up.
  • utils/plot_label_generation.py is a new script that plots all the minutiae of the steps involved in label generation, making plots and large npz files containing lots of variables (which I will explain later). By default, each image is modeled with its own random forest. Uncomment "#do_sim = True" to run in 'chain simulation mode', where the model is updated in a chain, simulating what Doodler does.
  • utils/convert_annotations2npz.py is a new script that will convert annotation label images and associated images (created and used, respectively, by/during a previous incarnation of Doodler) into the npz format used by this version
  • utils/gen_npz_4_zoo.py is a new script that will strip just the image and one-hot encoded label stack image for model training with Zoo

How to use viz_npz.py

Purpose

This is a simple utility that allows you to visualize the contents of npz files.

Usage

From the main Doodler folder ...

cd utils

The syntax is this:

python viz_npz.py [-t {0}/1/2]

Arguments

-t (input npz file type)

  • 0: (default) normal Doodler output file
  • 1: labelgen
  • 2: npz_zoo

Use case: I want to print doodles and labels from each npz file from a Doodler session

python viz_npz.py

which is equivalent to

python viz_npz.py -t 0

Use case: I want to print doodles and the RF and CRF labels from each npz file created by running plot_label_generation

python viz_npz.py -t 1

Use case: I want to print the contents of each npz file created by running gen_npz_4_zoo (i.e. the inputs to segmentation Zoo)

python viz_npz.py -t 2

How to split up large photos

Doodler works well with small to medium sized imagery where the features and objects can be labeled without much or any zoom or pan. This depends a lot on the image resolution and content, so it is difficult to give general guidelines.

But it's easy enough to chop images into pieces, so you should experiment with a few different image sizes.

Let's start with an image called big.jpg.

I recommend the command-line program ImageMagick, available for all major platforms. It's an incredibly powerful and useful set of tools for manipulating images. You can use the ImageMagick command-line tools for splitting and merging imagery. We use the magick command (convert on some Linux distributions).

Halves

Split into two lengthways:

magick big.jpg -crop 50%x100% +repage fordoodler_%02d.jpg

Quarters

Following the same logic, to chop the image into quarters, use:

magick big.jpg -crop 50%x50% +repage quarters_fordoodler_%02d.jpg

The first two quarters are shown below:

Specific size

To chop the image into tiles of a specific size, for example 1024x1024 pixels, use:

magick big.jpg -crop 1024x1024 +repage px1024_fordoodler_%02d.jpg

The first three tiles are shown below:

Easy peasy!

Recombine

After you've labeled, you may want to recombine your label image. ImageMagick includes the montage tool, which is handy for this task. For example, the image quarters can be recombined like this:

magick montage -mode concatenate -tile 2x2 quarters*.jpg recombined.jpg

and the equivalent command to combine the two vertical halves is:

magick montage -mode concatenate -tile 2x1 fordoodler*.jpg recombined.jpg

Happy image cropping!

Splitting up large geoTIFF orthomosaics

Doodler can work with really large images, but it is usually best to keep your images < 10,000 pixels in any dimension, because then the program will do CRF inference on the whole image at once rather than in chunks. This usually results in better image segmentations that are more consistent with your doodles.

So, this post is all about how to make smaller image tiles from a very large geoTIFF format orthomosaic, using Python. The smaller tiles will be written out as image tiles, with their relative position in the larger image encoded in the file name, for easy reassembly.

We'll need a dependency not included in the doodler environment: gdal

conda install gdal

Now, in python:

import os
# gdal_translate is called below via os.system(), so the Python gdal bindings are not needed

How large do you want your output (square) image tiles to be? (in pixels)

tilesize = 5000

What images would you like to chop up?

bigfiles = [
'Sandwich/2017-01-09_Sandwich_5cm_ortho.tif',
'Sandwich/2017-02-14_Sandwich_5cm_ortho.tif',
'Sandwich/2017-03-16_Sandwich_5cm_ortho.tif',
'Sandwich/2018-01-10_Sandwich_5cm_ortho.tif',
]

List the widths and heights of those input bigfiles

widths = [13314, 13314, 13314, 19972]
heights = [6212, 6212, 6212, 9319]

Specify a new folder for each set of image tiles (one per big image)

folders = ['Sandwich/2017-01-09_5cm', 'Sandwich/2017-02-14_5cm',\
'Sandwich/2017-03-16_5cm','Sandwich/2018-01-10_5cm']

Make file name prefixes by borrowing the folder name:

prefixes = [f.split('/')[-1] for f in folders]  # e.g. 'Sandwich/2017-01-09_5cm' -> '2017-01-09_5cm'

Finally, loop through each file and chop it into chunks using gdal_translate, called via os.system(). Then move the tiles into their respective folders:

for k, (b, f, p) in enumerate(zip(bigfiles, folders, prefixes)):
    # chop the image into tiles, one gdal_translate call per tile
    for i in range(0, widths[k], tilesize):
        for j in range(0, heights[k], tilesize):
            gdaltranString = "gdal_translate -of GTIFF -srcwin "+str(i)+", "+str(j)+", "+str(tilesize)+", " \
                +str(tilesize)+" "+b+" "+p+"_"+str(i)+"_"+str(j)+".tif"
            os.system(gdaltranString)
    # move those tiles into their own directory
    os.mkdir(f)
    os.system('mv '+p+'*.tif '+f)
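Because gdal_translate's -srcwin option preserves each tile's georeferencing, the tiles can later be mosaicked back together. As a sketch (assuming the gdal command-line tools are installed; the output file name is hypothetical), something like this reassembles the first set of tiles:

# mosaic one set of tiles back into a single geoTIFF (output name is hypothetical)
os.system('gdal_merge.py -o recombined.tif '+folders[0]+'/'+prefixes[0]+'*.tif')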

Merge a 3-band and a 1-band image

Doodler can use 1, 3, and 4-band input imagery. If the imagery is 3-band, it is assumed to be RGB and is, by default, augmented with 3 additional derivative bands.

But how do you make a 4-band image from a 3-band image and a 1-band image?

That additional band might be acquired with an additional sensor, but more commonly it is a DEM (Digital Elevation Model) corresponding to the scene.

I know of two ways. If you have the gdal binaries installed, first strip the image into its component bands using gdal_translate:

gdal_translate -b 1 data/images/4_rgb.png red.png
gdal_translate -b 2 data/images/4_rgb.png green.png
gdal_translate -b 3 data/images/4_rgb.png blue.png

Then merge them together using gdal_merge.py:

gdal_merge.py -separate -o merged.tiff -co PHOTOMETRIC=MINISBLACK red.png green.png blue.png data/images/4_elev.png

The equivalent in Python can be achieved without the gdal bindings, using the libraries already in your doodler conda environment.

First, import libraries

import tifffile
import cv2
import numpy as np

Read RGB image

im1 = cv2.imread('data/images/4_rgb.png')  # note: cv2.imread returns bands in BGR order

Read elevation and get just the first band (if this is 3-band)

im2 = cv2.imread('data/images/4_elev.png')[:,:,0]

If you had a 1-band elevation image, you would use this instead (cv2.imread converts to 3 bands by default, so force a single-band read):

im2 = cv2.imread('data/images/4_elev.png', cv2.IMREAD_GRAYSCALE)

Merge bands - creates a numpy array with 4 channels

merged = np.dstack((im1, im2))

Write the image to file

tifffile.imwrite('test.tiff', merged)

You can use the following to read it back in

merged = tifffile.imread('test.tiff')

And verify with 'shape' - it should be 4 bands

merged.shape
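For a 3-band image merged with one extra band, this should return a tuple whose last element is 4, e.g. (height, width, 4).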