What's new in version 1.2.1? (May 10, 2021)

Existing users of Doodler should be aware of the major changes in version 1.2.1, posted May 10. There are a lot of them, and they mean you will use Doodler a little differently going forward. I hope you like the changes!

First, Doodler has a new documentation website (you know this because you are here).

There are so many new features in this release that they are organized by theme.

GUI:

  • Model independence factor replaces the previous "color class tolerance" parameter (the mu parameter in the CRF). Higher values give the CRF greater ability to 'undo' class assignments from the RF model. Typically you want to trust the RF outputs, so keep this number small.
  • Blur factor replaces the previous "Blurring parameter" (the theta parameter in the CRF). Larger values mean more smoothing.
  • The CRF controls are now ordered Blur factor, then Model independence factor, then downsample, and finally probability of doodle. This is the order of likelihood that you will need to tweak them.
  • There is no longer an option to apply a median filter. Its usage was generally disappointing/problematic and it has been replaced with an alternative method (see below).
  • The 'show/compute segmentation' button is now blue, so it stands out a little better.
  • On the RF controls, there is no longer the option to change the 'sigma range'. SIGMA_MAX is now hard-coded at 16 and SIGMA_MIN at 1. Tests reveal insensitivity to these parameters, and keeping them as options presents problems for other areas of the workflow. By enforcing the same range of sigma scales, the RF always uses the same number (75) of output feature maps.

I/O:

  • JPEG files with all extensions (.jpg, .jpeg, or .JPG) are now usable inputs
  • Greyscale and annotations are no longer saved to png files; they are saved instead to a compressed numpy (npz) file, which encodes
    • 'image' = image
    • 'label' = one-hot-encoded label array
    • 'color_doodles' = 3D, or color, doodles
    • 'doodles' = 2D, or greyscale, doodles
  • The npz file is overwritten, but old arrays are kept under names prefixed with '0'. Each time, another '0' is prepended, so among the prefixed names more '0's means newer, while the names above without any '0' always hold the newest arrays. Color images are still produced with time tags.
  • DEFAULT_CRF_DOWNSAMPLE is set to 4
  • When running predict_folder.py, the user chooses between two modes: saving only the default basic outputs (the final output label), or saving the full stack of outputs for debugging or optimizing
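
As a rough sketch of how the npz output described above might be read back with numpy: the four key names come from the list above, but the file name, array shapes, dtypes, and the channels-last layout of the one-hot label are assumptions made only so the example runs on its own.

```python
import os
import tempfile

import numpy as np

# Build a small stand-in file so the sketch is self-contained.
# Shapes, dtypes and the file name are illustrative assumptions.
h, w, n_classes = 4, 6, 2
demo = {
    "image": np.zeros((h, w, 3), dtype=np.uint8),
    "label": np.zeros((h, w, n_classes), dtype=np.uint8),
    "color_doodles": np.zeros((h, w, 3), dtype=np.uint8),
    "doodles": np.zeros((h, w), dtype=np.uint8),
    "0label": np.ones((h, w, n_classes), dtype=np.uint8),  # an older, '0'-prefixed array
}
path = os.path.join(tempfile.mkdtemp(), "example.npz")  # hypothetical file name
np.savez_compressed(path, **demo)

# Read the file back: un-prefixed names hold the newest arrays.
with np.load(path) as data:
    newest = {k: data[k] for k in ("image", "label", "color_doodles", "doodles")}
    older_keys = sorted(k for k in data.files if k.startswith("0"))

# Collapse the one-hot stack to a 2D integer label image
# (assumes classes are stacked along the last axis).
label_2d = np.argmax(newest["label"], axis=-1)
print(label_2d.shape, older_keys)
```

Filtering on the '0' prefix is just one way to separate current arrays from earlier ones; check the keys in your own files before relying on it.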

Modeling:

  • All input images are now standardized and rescaled to [0,1]. This results in better portability of RF models, and is generally good practice for dealing with large outlier values.
  • The CRF label is now resized with no antialiasing, and is inpainted at transition areas between classes
  • Decreased maximum number of allowable samples in RF model to 200,000
  • Small holes and islands in the one-hot encoded RF and CRF masks are now removed. The threshold area in pixels is 2*W, where W is the width of the image in pixels.
  • Median filtering has been removed. It is no longer needed, and it created problems and extra buttons/complexity. Instead ...
  • Implements 'one-hot encoded mask spatial filtering'
  • Implements inpainting on regions spatially filtered
  • Pen width is used as-is; it is no longer exponentially scaled
  • SIGMA_MAX=16 and SIGMA_MIN=1 are now hard-coded. This makes it easier to manage the number of features, which now has to be 75. These parameters also make very little difference
  • In predict_folder, extracted features are memory-mapped to save RAM
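
To make the image standardization step and the hole/island size threshold above concrete, here is a minimal numpy sketch. The function names `standardize` and `rescale` and the sample pixel values are assumptions for illustration, and Doodler's own implementation may differ in detail.

```python
import numpy as np


def standardize(img):
    """Shift an image to zero mean and unit standard deviation."""
    img = np.asarray(img, dtype=np.float64)
    std = img.std()
    if std == 0:  # constant image: avoid dividing by zero
        return np.zeros_like(img)
    return (img - img.mean()) / std


def rescale(arr, lo=0.0, hi=1.0):
    """Linearly map the array's minimum..maximum onto lo..hi."""
    arr = np.asarray(arr, dtype=np.float64)
    span = arr.max() - arr.min()
    if span == 0:  # constant array: everything maps to lo
        return np.full_like(arr, lo)
    return lo + (hi - lo) * (arr - arr.min()) / span


# Tiny made-up greyscale image containing one large outlier value
img = np.array([[10, 12, 11], [13, 9, 250]], dtype=np.uint8)
out = rescale(standardize(img))

# Area threshold (in pixels) for removing small holes and islands: 2*W
W = img.shape[1]
area_threshold = 2 * W

print(out.min(), out.max(), area_threshold)
```

Whatever the input range, the output of this sketch spans exactly [0,1], which is what lets images with different raw intensity ranges be fed to the same model.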

Other:

  • RF feature extraction and CRF 'test time augmentation' are now both carried out in parallel. This should speed things up.
  • utils/plot_label_generation.py is a new script that plots all the minutiae of the steps involved in label generation. It makes plots and large npz files containing many variables, which I will explain later. By default each image is modeled with its own random forest. Uncomment "#do_sim = True" to run in 'chain simulation mode', where the model is updated in a chain, simulating what Doodler does.
  • utils/convert_annotations2npz.py is a new script that will convert annotation label images and associated images (created and used respectively by/during a previous incarnation of Doodler) to the npz format
  • utils/gen_npz_4_zoo.py is a new script that will strip out just the image and one-hot encoded label stack for model training with Zoo