This plugin was created to streamline the generation of image chips with "labels" to be fed into machine learning (ML) algorithms. In order to facilitate robust training, we want to generate a large number of image chips across a wide range of acquisition parameters. Those include:

  • Different targets and/or variants of targets

  • Different backgrounds (the context within which the target appears)

  • Different ground sampling distances (GSDs)

  • Different sensor view angles (zenith and azimuth)

  • Different illumination angles (zenith and azimuth)

Historically this has been accomplished using external scripting with a conventional DIRSIG simulation. The primary goal of this plugin is to make it easy to configure all the degrees of freedom in one location and have the plugin manage the creation of the images.

This plugin makes several assumptions and employs simplifications in how it models some elements of the simulation. Most of these choices were made in light of what training and test data for ML algorithms look like: most ML workflows employ 8-bit and/or 24-bit images, and various physical parameters of the sensor, scene, atmosphere, etc. are generally irrelevant. For example, the algorithm isn’t aware of the size of the pixels on the focal plane or the effective focal length, but it is aware of the GSD of the images. Likewise, the algorithm isn’t explicitly aware of a hazy maritime atmosphere vs. a clear desert atmosphere, but it is aware that some images have lower contrast and some have higher contrast. Given how these images are generally used in ML workflows, many of the approaches employed in this plugin have been simplified to streamline the setup of these simulations.

Camera Modeling

The modeling of the camera has been simplified to avoid the user needing detailed system specifications that are largely irrelevant in the context in which the output images are used. For example, the user defines the GSD directly rather than the physical size of the pixel elements on the focal plane and an effective focal length. As a result, the object-to-image plane projection is orthographic rather than perspective. Because the final imagery (the PNG, JPEG, etc. images used with the ML algorithm) won’t have physical units, it is not important to have detailed spectral response functions for each channel. Hence, the definition of spectral channels is limited to a simple bandpass defined by a lower and upper wavelength, and the response is assumed to be uniform across that bandpass. There are options to incorporate the effective point-spread function (PSF) of the system, but that PSF is currently assumed to be constant across all channels.

Atmospheric Modeling

The ChipMaker plugin in DIRSIG5 is technically a combo plugin because it binds to both the sensor API (to drive the image formation) and the atmosphere API (to drive the source direction, direct illumination and diffuse illumination). This plugin offers two approaches to atmospheric modeling, depending on the required fidelity. Neither is a fully physics-driven approach:

  1. The default model is a simple analytical model. The total irradiance from the hemisphere is spectrally constant and partitioned between direct and diffuse components. There is no path scattering or path transmission between the sensor and the scene. For a physics-based remote sensing simulation tool this seems like an inappropriate simplification of the real world. However, utilizing a physics-based atmosphere model (for example, MODTRAN) would entail an enormous amount of computation, since every chip would involve a unique view and illumination geometry. At this time, the reality is that calibrated images are rarely used to train ML algorithms, and ML algorithms are rarely supplied calibrated images to analyze. Hence, it doesn’t matter if the exact transmission and scattering is modeled, because the algorithms are typically working with 8-bit, 24-bit, etc. images where the impacts of path transmission and scattering manifest as relative contrast differences in the image. Therefore, the approach here is to capture the multiplicative transmission loss and additive scattering gain in the conversion from output reflectance to integer count images. For example, a hazy atmosphere (high scattering, low transmission) can be emulated as a linear radiance-to-counts scaling that has a lower gain and higher bias when compared to a clearer atmosphere.

  2. If more fidelity is required, a higher fidelity approach to atmospheric modeling is offered: the FourCurveAtmosphere. This model approximates atmospheric effects through four spectral curves that are a function of the solar zenith angle. The four curves are: (1) the ground-reaching solar irradiance, (2) the path radiance per unit distance, (3) the path extinction per unit distance and (4) the hemispherically integrated sky irradiance. This approach improves upon the simple parametric model while not requiring massive atmospheric radiative transfer computations unique to each chip. This plugin utilizes a pre-computed atmospheric database, and each DIRSIG release includes a ready-to-run database containing a large number of atmospheric conditions. Alternatively, the user can generate their own database of conditions using the fourcurve_builder tool.
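The contrast emulation described in the first (parametric) model can be sketched as a simple linear mapping from output values to integer counts. The gain and bias numbers below are illustrative assumptions chosen for this sketch, not plugin inputs:

```python
def reflectance_to_counts(reflectance, gain, bias, bit_depth=8):
    """Map a unitless output value in [0, 1] to an integer count.

    'gain' emulates the multiplicative path transmission loss and
    'bias' emulates the additive path scattering gain. Both values
    are illustrative knobs, not part of the ChipMaker input file.
    """
    max_count = (1 << bit_depth) - 1
    counts = gain * reflectance * max_count + bias
    return max(0, min(max_count, round(counts)))

# A clear atmosphere: high gain, low bias -> wide contrast range.
clear = [reflectance_to_counts(r, gain=0.95, bias=5) for r in (0.0, 0.5, 1.0)]

# A hazy atmosphere: lower gain, higher bias -> compressed contrast.
hazy = [reflectance_to_counts(r, gain=0.60, bias=60) for r in (0.0, 0.5, 1.0)]
```

The hazy mapping yields a smaller spread of counts between dark and bright scene content, which is exactly the "lower contrast" cue the ML algorithm sees in place of an explicit atmosphere.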


The input file for the plugin is a JSON formatted file. An example file is shown below and will be discussed section by section. See the ChipMaker2 demo for a working example.

    "atmosphere" : {
        "database" : "./my_atm_db.hdf",
        "conditions" : [
            "nice_conditions", "yucky_conditions", "ok_conditions"
    "camera" : {
        "image_size" : {
            "x" : 128,
            "y" : 128
        "gsd_range" : {
            "minimum" : 0.05,
            "maximum" : 0.10
        "channellist" : [
                "name" : "Red",
                "minimum" : 0.6,
                "maximum" : 0.7
                "name" : "Green",
                "minimum" : 0.5,
                "maximum" : 0.6
                "name" : "Blue",
                "minimum" : 0.4,
                "maximum" : 0.5
        "readout" : {
            "frame_time" : 1e-03,
            "integration_time" : 1e-04
        "psf" : {
            "image" : "circle_psf.png",
            "scale" : 10.0
        "image_filename" : {
            "basename" : "chip",
            "extension" : "img"
        "truth" : [
            "scene_x", "scene_y", "scene_z", "geometry_index"
    "time_range" : {
        "minimum" : 0,
        "maximum" : 0
    "view" : {
        "zenith_range" : {
            "minimum" : 5,
            "maximum" : 40
        "azimuth_range" : {
            "minimum" : 0,
            "maximum" : 360
    "source" : {
        "zenith_range" : {
            "minimum" : 5,
            "maximum" : 40
        "azimuth_range" : {
            "minimum" : 0,
            "maximum" : 360
    "setup" : {
        "random_seed" : 54321,
        "target_tags" : [ "box", "sphere" ],
        "options" : [ "with_and_without" ],
        "count" : 100,
        "report_filename" : "labels.txt"

Atmosphere (optional)

This optional section is only required if using the FourCurveAtmosphere model for the atmosphere.

Important If you want to use the original parametric atmosphere, then do not include the atmosphere section at all.

The path to the FourCurveAtmosphere HDF database. The interface control document for this database can be found here. More information about the default database can be found in the README in $DIRSIG_HOME/lib/data/atm. Examples of making your own database can be found in the FourCurveAtm1 demo or the FourCurveAtmosphere manual. To use the default FourCurveAtmosphere database, do not include the database variable or assign it an empty string (for example, "database" : "").


The list of atmospheric conditions from the database to use. If more than one is given, they will be selected randomly from the list. To use all the conditions in the FourCurveAtmosphere database, do not include the conditions variable or assign it an empty array (e.g. "conditions" : []).


The camera description utilizes parameters that are image-centric rather than camera-centric. That means that rather than specifying the physical size of the pixels in the array, an effective focal length, etc., the user specifies the dimensions of the image and the GSD. The camera is currently modeled as an ortho camera, to avoid camera-specific distortions that are beyond the scope of the camera model.


The size of the image frames to be generated in x (width) and y (height).


The user can (optionally) provide a range of GSDs to model. If the user wants all the images to have the same GSD, then set the minimum and maximum to the same value. If this range is not provided, the plugin will automatically compute the GSD so that each target fits within the image.
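As an illustration of the automatic behavior, the GSD needed to fit a target in the chip is essentially the target's ground extent divided by the image dimension. The function and margin factor below are hypothetical, since the plugin's internal calculation is not documented here:

```python
def auto_gsd(target_extent_m, image_size_px, margin=1.2):
    """Hypothetical sketch of choosing a GSD so a target fits in a chip.

    target_extent_m : largest horizontal extent of the target (meters)
    image_size_px   : chip dimension along that axis (pixels)
    margin          : assumed padding factor so the target isn't clipped
    Returns the GSD in meters per pixel.
    """
    return margin * target_extent_m / image_size_px
```

For example, a 12.8 m target in a 128-pixel chip with no margin requires a GSD of 0.1 m/pixel; a larger margin simply yields a coarser GSD (more ground per pixel).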


The user can specify a set of channels to be modeled by the sensor. The channels are assumed to have simple uniform responses across the spectral bandpass defined by the minimum and maximum variables. The name variable specifies the name that will be used for the corresponding band in the output image.


The user specifies the file "basename" and "extension" and the simulation will write images to files using a basenameX.extension naming pattern, where X is the index of the chip.


The pixels can be integrated using a global shutter, where all pixels are integrated synchronously and then read out, or line by line in an asynchronous manner to emulate either a rolling shutter or a pushbroom scanning sensor. The global (synchronous) integration method is the default, and the integration_time is the duration for which every pixel is integrated. To enable the line-by-line (asynchronous) integration method, the frame_time must be set; the line-to-line delay is assumed to be the frame time divided by the number of lines. In this case, the integration_time is the duration for which each line of pixels is integrated.
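The asynchronous timing described above can be sketched as follows. The helper is hypothetical (not part of the plugin) and simply derives a per-line integration window from the frame_time and integration_time variables:

```python
def line_capture_times(frame_time, integration_time, num_lines):
    """Illustrative sketch of the asynchronous (line-by-line) readout.

    Each line starts integrating one line-delay after the previous one,
    where the line-to-line delay is frame_time / num_lines. Returns a
    (start, stop) integration window for each line, in seconds relative
    to the start of the frame. Hypothetical helper, not a ChipMaker API.
    """
    line_delay = frame_time / num_lines
    return [(i * line_delay, i * line_delay + integration_time)
            for i in range(num_lines)]

# Values from the example "readout" section: 1 ms frame, 0.1 ms integration,
# applied to the example 128-line image.
times = line_capture_times(frame_time=1e-3, integration_time=1e-4, num_lines=128)
```

With these numbers each line integrates for 0.1 ms and the last line starts 127/128 of a millisecond after the first, which is what produces rolling-shutter distortion on moving objects.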


The user can optionally request truth for each image. This will be output as additional bands in the image files.


The user can optionally describe the point spread function (PSF) of the system using an image file. The image variable is used to supply the name of the file containing the PSF image (PNG, JPEG, TIFF, GIF). Because the contribution area described in the PSF image is usually much larger than the pixel, the scale variable is used to describe the width of that image in pixel units.


Scenes that contain motion (moving objects) can be sampled as a function of time, which allows the moving objects to be imaged in different locations and/or orientations (as defined by their respective motion). The range of sample times is defined in the time_range section of the input. The minimum and maximum times are relative and in seconds.


The range of view directions for the camera is defined in the view section of the input. The zenith (declination from nadir) and azimuth (CW East of North) are supplied as minimum and maximum pairs. These angles are in degrees.

The optional offset_range will introduce a spatial offset of the target within the image. The range is used to generate a random XY offset to the selected target’s location. The values are in meters. The default offset is 0 meters.

The optional distance_range will vary the "flying height" of the sensor. Given the orthographic projection of the chips, this parameter is normally irrelevant, but is useful to vary the amount of path radiance or extinction present in the chips when using the FourCurveAtmosphere model. The values are in meters. The default distance is 1000 meters.


The direction of the source (sun) in relation to the target is defined in the source section of the input. The zenith (declination from nadir) and azimuth (CW East of North) are supplied as minimum and maximum pairs. These angles are in degrees.


The setup section of the file specifies the overall setup of the simulation to be performed, including the specification of which targets to sample, the number of images to be generated and the name of the file containing key label information.


The list of tags used to select the targets in the scene to be imaged.


The number of image chips to generate.


The random set of targets, view directions, source directions, etc. can be expected to change from simulation to simulation because the seed for the random number generator that drives these random parameters is different for each execution. If the user desires the ability to reproduce a specific simulation, then they can supply the random_seed variable to fix it so that it won’t change.


There are several options related to how the simulation runs. See below for more detail.


The ASCII/text report that describes the target, view angles, illumination angles, GSD, etc. for each image chip is written to the filename provided by this variable.


The following options control how the simulation is performed.


This option will cause the simulation to hide all the other targets in the selection set while the chip for a given target is being generated. In the example above the selection set includes anything that has the tags "box" or "sphere". Therefore each chip will be centered on a "box" or "sphere". With this option included, all other "box" and "sphere" objects will be hidden except for the one being imaged. Note that "cylinders" (not included in the example tag set) will not be a chip target or be hidden when imaging any of the "box" or "sphere" targets.


This option will cause the simulation to produce A/B image pairs with and without the current target present. If N chips are requested (see the count variable in the setup), the result is N image pairs; for example, the first pair will be named chip0a.img (contains the target) and chip0b.img (same parameters, but without the target).


This option allows the user to reproduce a set of images using the output label report (see the report_filename variable in the setup) from a previous simulation. When using this mode, rather than choosing a random target, random view, etc., the plugin will use the parameters (target index, time, GSD, source angles, etc.) from the report file. Note that if the scene changes (specifically, if new targets are added), then the output image set will be different.


This option will create a small meta-data file for each chip in an easily parsed JSON format (for example, chip10.img will have a companion file named chip10.meta).

Note The with_and_without and rerun_from_report options cannot be combined at this time.


Illumination and output units

When using the FourCurveAtmosphere model, the output units are radiance in W/(cm2 sr).

When not using the FourCurveAtmosphere model, the direct/diffuse illumination partitioning is fixed at 80% and 20%, respectively. The total hemispherical irradiance is currently Pi, which results in output images that have units of total reflectance. This allows the end user to easily calibrate the images into whatever unit space they desire.
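The reflectance-valued output follows from the Lambertian relationship L = rho * E / pi: with the total irradiance fixed at Pi, the radiance numerically equals the reflectance. A minimal check of that arithmetic (the 1000 W/m^2 rescaling value is an arbitrary example, not a plugin default):

```python
import math

def lambertian_radiance(reflectance, total_irradiance):
    """Radiance leaving a Lambertian surface: L = rho * E / pi."""
    return reflectance * total_irradiance / math.pi

# With the plugin's E = pi, the radiance numerically equals the
# reflectance, so pixel values can be read directly as total reflectance.
assert math.isclose(lambertian_radiance(0.35, math.pi), 0.35)

# Calibrating to any other total irradiance is a single multiplication,
# e.g. an arbitrary 1000 W/m^2 illumination level:
calibrated = lambertian_radiance(0.35, 1000.0)
```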

Image format

The output image is a binary/text ENVI image data/header file pair. The image data is single-precision floating-point and the bands are written in a band-interleaved by pixel (BIP) order. The first N bands in the image contain the channels defined in the sensor description. The remaining bands contain the truth data requested.
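Reading these files back is straightforward once the dimensions and band count are known from the ENVI header. The sketch below is a minimal reader that assumes little-endian 32-bit floats in BIP order and takes the dimensions as arguments rather than parsing the .hdr file:

```python
import struct

def read_envi_bip(filename, width, height, num_bands):
    """Read a band-interleaved-by-pixel (BIP) image of 32-bit floats.

    In BIP order, all band values for a pixel are stored contiguously
    before the next pixel. The dimensions would normally come from the
    ENVI .hdr file; here they are passed in directly to keep the sketch
    self-contained. Little-endian byte order is assumed.
    Returns pixels[row][col] -> tuple of band values.
    """
    fmt = "<%df" % num_bands            # one pixel: num_bands floats
    pixel_size = struct.calcsize(fmt)
    pixels = []
    with open(filename, "rb") as f:
        for _ in range(height):
            row = [struct.unpack(fmt, f.read(pixel_size))
                   for _ in range(width)]
            pixels.append(row)
    return pixels
```

For the example configuration above (3 channels plus 4 truth bands), num_bands would be 7, with the channel bands first followed by the requested truth bands.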

Meta Data Files

If the optional meta-data files are generated (enabled via the make_meta option), then a small JSON file is created for each chip image. This file contains the values for all the key variables for the chip.

An example meta-data JSON file.
    "image_filename" : "chip01.img",
    "image_size" : [180,180],
    "target_index" : 57,
    "gsd" : 0.281232,
    "time" : 102.417,
    "view_zenith" : 0.24056,
    "view_azimuth" : 5.71626,
    "view_distance" : 1000,
    "source_zenith" : 0.276679,
    "source_azimuth" : 4.24691,
    "atm_conditions" : "mls_rural_15km_dis8.tp5",
    "tags" : ["aircraft","ka27","choppers","idle","korean"]


To use the ChipMaker plugin in DIRSIG5, the user must use the newer JSON formatted simulation input file (referred to as a JSIM file, with a .jsim file extension). At this time, these files are hand-crafted (no graphical editor is available). An example is shown below:

    "scene_list" : [
        { "inputs" : "./demo.scene" }
    "plugin_list" : [
            "name" : "ChipMaker",
            "inputs" : {
                "input_filename" : "./chips.json"

The ChipMaker2 Demo

The ChipMaker2 demo contains a working example of this plugin. The demo contains a scene that includes hundreds of helicopters and a few planes. There are 3 material variants of the same helicopter (Indian, Korean and Russian schemes), and for each of those there are "idle" (rotors are not spinning) and "active" (rotors are spinning) variants. The plane is defined with a dynamic instance that has it landing on the runway. The image below shows what the scene looks like at a given time.

Figure 1. Overhead view of the ChipMaker2 demo scene.

To get chips of the helicopters there are many tags to choose from. The general "choppers" tag will get all the helicopters. The "idle" tag will get all the idle helicopters regardless of country scheme. The "russian" tag will get the "idle" and "active" helicopters with the Russian material scheme. The image below shows a set of chips just using the general "choppers" tag:

chopper chips
Figure 2. A chipset using the "choppers" tag.

For the helicopters, the rotors are spinning but they all have static instances (each helicopter is fixed at its position). Use of the time_range would mean each chip looks at the object at a different time within the time range. For the helicopters, that would mean seeing the rotors in different positions. Since the plane is moving (it has a dynamic instance), the time_range can be employed to greater effect. In this case, it will find the plane in different locations during the time window. The chip set below is all of the same plane, but using a time range that images it well above the runway on approach through when it has landed (and everywhere in between):

plane chips
Figure 3. A chipset using the "planes" tag.