Video Series: Hands-On AI—Part 3: Augmentation Transformations

Learn some of the transformations used for data augmentation, an important aspect when training a model to work in varied environments. Explore rotation, horizontal shift, vertical shift, shearing, zoom, and horizontal and vertical flip transformations with examples.

Welcome back. Karl, here, with this third video in our Hands-On AI series. In this video, I cover some common data augmentation techniques and functions.

You augment data to expand a smaller dataset into a larger one. It also lets you train more robust models for real-world applications.

Rotation is the first technique. It refers to rotating an image either clockwise or counterclockwise. The parameter that controls the rotation is called rotation range. It specifies the range of rotations, in degrees, from which the random angle will be chosen. Quick note, during the rotation the size of the image remains the same. This means that some of the image regions will be cropped out and some of the regions of the new image will need to be filled. We'll get into that later.
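To make the rotation step concrete, here is a minimal sketch in plain Python, not the implementation used in the series. It assumes the image is a 2D list of pixel values, uses nearest-neighbor sampling, and fills uncovered regions with a constant value. The names `rotate_image`, `random_rotation`, and `fill` are hypothetical, and which direction counts as positive is a convention of this sketch only.

```python
import math
import random


def rotate_image(image, angle_degrees, fill=0):
    """Rotate a 2D grid about its center; the output keeps the input size."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    theta = math.radians(angle_degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Start from an all-fill canvas; pixels with no source stay filled.
    output = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: find the source pixel that lands on (x, y).
            dx, dy = x - cx, y - cy
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            # Sources outside the frame are the "cropped out" regions.
            if 0 <= src_x < width and 0 <= src_y < height:
                output[y][x] = image[src_y][src_x]
    return output


def random_rotation(image, rotation_range, rng=random):
    """Pick an angle uniformly from [-rotation_range, rotation_range] degrees."""
    angle = rng.uniform(-rotation_range, rotation_range)
    return rotate_image(image, angle), angle
```

Calling `random_rotation(image, 30)` would return a rotated copy together with the angle that was drawn, somewhere between -30 and 30 degrees.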

The next transformation we will cover is the horizontal shift. Horizontal shift slides the image to the right or to the left. A similar transformation is the vertical shift, which shifts the image along the vertical axis, either up or down. The parameter that controls the range of the vertical shift is called height shift range and is measured as a fraction of total height. The horizontal shift has a matching width shift range parameter, measured as a fraction of total width.
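The shifts can be sketched the same way. This is an illustrative example only, assuming a 2D list of pixel values and a constant fill for the vacated area. The helper names `shift_image` and `random_shift` are hypothetical, and the two range arguments are fractions of the image width and height.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Shift a 2D grid by dx columns (positive = right) and dy rows (positive = down)."""
    height, width = len(image), len(image[0])
    output = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Each output pixel comes from the pixel dx to its left and dy above.
            src_x, src_y = x - dx, y - dy
            if 0 <= src_x < width and 0 <= src_y < height:
                output[y][x] = image[src_y][src_x]
    return output


def random_shift(image, width_shift_range, height_shift_range, rng=random):
    """Draw shifts as fractions of the image size, then convert them to whole pixels."""
    height, width = len(image), len(image[0])
    dx = round(rng.uniform(-width_shift_range, width_shift_range) * width)
    dy = round(rng.uniform(-height_shift_range, height_shift_range) * height)
    return shift_image(image, dx, dy)
```

With a range of 0.2 on a 100-pixel-wide image, for example, the sketch would slide the image by at most 20 pixels in either direction.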

Shear mapping, or shearing, displaces each point in a vertical direction by an amount proportional to its distance from an edge of the image. Note that, in general, the direction does not have to be vertical and can be arbitrary.
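As a rough illustration of the vertical case, the sketch below moves each column down by an amount proportional to its distance from the left edge. It assumes a 2D list of pixel values and a constant fill. The function name `shear_vertical` and the unitless `shear_factor` are hypothetical choices for this example, and a library may well express shear as an angle instead.

```python
def shear_vertical(image, shear_factor, fill=0):
    """Displace column x downward by shear_factor * x rows; the size is unchanged."""
    height, width = len(image), len(image[0])
    output = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: the pixel now at row y started at row y - shear_factor * x.
            src_y = round(y - shear_factor * x)
            if 0 <= src_y < height:
                output[y][x] = image[src_y][x]
    return output
```

The leftmost column has distance zero from the edge, so it stays where it is, while columns farther to the right slide progressively farther down.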

Zoom is a transformation that zooms the initial image in or out. The zoom range parameter controls the zooming factor.
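A zoom about the image center can be sketched as follows. This is a simplified example with nearest-neighbor sampling on a 2D list of pixel values. The names `zoom_image` and `random_zoom` are hypothetical, and the convention that a factor above 1 zooms in, with the factor drawn from 1 - zoom_range to 1 + zoom_range, is an assumption of this sketch rather than something stated in the video.

```python
import random


def zoom_image(image, factor, fill=0):
    """Zoom about the center: factor > 1 zooms in, factor < 1 zooms out."""
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2, (width - 1) / 2
    output = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: scale the offset from the center by 1 / factor.
            src_x = round((x - cx) / factor + cx)
            src_y = round((y - cy) / factor + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                output[y][x] = image[src_y][src_x]
    return output


def random_zoom(image, zoom_range, rng=random):
    """Draw a factor from [1 - zoom_range, 1 + zoom_range]; zoom_range must be below 1."""
    factor = rng.uniform(1 - zoom_range, 1 + zoom_range)
    return zoom_image(image, factor), factor
```

Zooming in crops the outer part of the image, while zooming out leaves a border that has to be filled, just as with rotation and shifting.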

A horizontal flip mirrors the image about the vertical axis, so left and right are swapped. You can turn it on or off using the horizontal flip parameter, which is a Boolean value.

Finally, there is a vertical flip. A vertical flip flips the image with regard to the horizontal axis. The vertical flip Boolean parameter controls whether or not this transformation is used.
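Both flips are simple enough to sketch in a few lines. The example assumes a 2D list of pixel values. The function names are hypothetical, and applying an enabled flip with probability 0.5 is an assumption of this sketch, since the video only says the Boolean switches the transformation on or off.

```python
import random


def flip_horizontal(image):
    """Mirror about the vertical axis: reverse the pixels within each row."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror about the horizontal axis: reverse the order of the rows."""
    return [row[:] for row in image[::-1]]


def random_flip(image, horizontal_flip=False, vertical_flip=False, rng=random):
    """Apply each enabled flip with probability 0.5 (assumed); disabled flips never run."""
    output = [row[:] for row in image]
    if horizontal_flip and rng.random() < 0.5:
        output = flip_horizontal(output)
    if vertical_flip and rng.random() < 0.5:
        output = flip_vertical(output)
    return output
```

Unlike rotation, shift, shear, and zoom, a flip moves every pixel to another position inside the frame, so nothing is cropped and nothing needs to be filled.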

That's it for this series of functions. Augmenting your dataset helps when it contains staged or posed photos and you need to prepare your model for a real-world scenario, or when you simply need to increase the size of your dataset.

Thanks for watching. Be sure to check out the links to read the article associated with this series and to learn more about AI. Also, stay tuned for the next episode where we combine and implement what we've learned so far in the series.