Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Data augmentation with ImageDataGenerator for videos (4D tensors) in Keras

I have an ImageDataGenerator in Keras that I would like to apply during training to every frame in short video clips which are represented as 4D numpy arrays with shape (num_frames, width, height, 3).

In the case of a standard dataset consisting of images each with shape (width, height, 3), I would normally do something like:

aug = tf.keras.preprocessing.image.ImageDataGenerator(
        rotation_range=15,
        zoom_range=0.15)

model.fit_generator(
        aug.flow(X_train, y_train),
        epochs=100)

How can I apply these same data augmentations to a dataset of 4D numpy arrays representing sequences of images?



1 Answer


I figured it out. I created a custom class that inherits from tensorflow.keras.utils.Sequence and uses scipy to augment each frame. The rotation angle is drawn once per clip, so every frame in a clip is rotated by the same amount.

import random

import numpy as np
import tensorflow as tf
from scipy import ndimage
from sklearn.preprocessing import LabelBinarizer


class CustomDataset(tf.keras.utils.Sequence):
    def __init__(self, batch_size, X_train, Y_train):
        super().__init__()
        self.batch_size = batch_size
        self.X_train = X_train
        self.Y_train = Y_train
        # Fit the encoder once on all labels so that every batch
        # gets the same one-hot column layout.
        self.encoder = LabelBinarizer().fit(Y_train)

    def __len__(self):
        # Number of batches per epoch.
        return self.X_train.shape[0] // self.batch_size

    def __getitem__(self, index):
        # Returns one batch of randomly drawn clips (index is ignored).
        X = []
        y = []
        for _ in range(self.batch_size):
            r = random.randint(0, self.X_train.shape[0] - 1)
            next_x = self.X_train[r]
            next_y = self.Y_train[r]

            # Draw the augmentation parameters once per clip so that
            # every frame of the clip is transformed identically.
            rotation_amt = random.randint(-45, 45)

            augmented_next_x = []
            for j in range(next_x.shape[0]):
                # cval=255 fills the corners exposed by the rotation with
                # white instead of overwriting every zero-valued pixel.
                transformed_img = ndimage.rotate(
                    next_x[j], rotation_amt, reshape=False, cval=255)
                augmented_next_x.append(transformed_img)

            X.append(augmented_next_x)
            y.append(next_y)

        X = np.array(X).astype('uint8')
        y = self.encoder.transform(np.array(y))

        return X, y

    def on_epoch_end(self):
        # Optional hook that runs at the end of each epoch, e.g. for reshuffling.
        pass
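
As a side note, the per-frame loop is not strictly needed for rotation: scipy's ndimage.rotate takes an axes argument that selects the plane of rotation, so a whole 4D clip can probably be rotated in one call. The following is only a minimal sketch under that assumption; rotate_clip and fill_value are hypothetical names used for illustration and do not appear in the original answer.

```python
import numpy as np
from scipy import ndimage


def rotate_clip(clip, angle, fill_value=255):
    """Rotate every frame of a 4D clip (frames, dim1, dim2, channels) by the same angle."""
    # axes=(1, 2) selects the two spatial axes as the rotation plane, so the
    # frame axis and the channel axis are left untouched.
    return ndimage.rotate(clip, angle, axes=(1, 2), reshape=False, cval=fill_value)


# Small random clip: 4 frames of 32x32 RGB (made-up data for illustration).
rng = np.random.default_rng(0)
clip = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
rotated = rotate_clip(clip, angle=20)
print(rotated.shape, rotated.dtype)
```

Because reshape=False keeps the output the same size as the input, the rotated clip can be dropped straight into a batch array.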

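I then pass an instance of this class to the fit_generator method (in TensorFlow 2.1 and later, fit_generator is deprecated and model.fit accepts a Sequence directly):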

training_data_augmentation = CustomDataset(BS, X_train_L, y_train_L)
model.fit_generator(
    training_data_augmentation, 
    epochs=300)
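
If the goal is to reuse the settings of an existing ImageDataGenerator (rotation_range, zoom_range and so on) rather than hand-written scipy calls, one possible route is its get_random_transform and apply_transform methods: draw the random parameters once per clip and apply them to every frame. This is a hedged sketch, not something the answer above does; augment_clip and augment_batch are hypothetical helper names, and ImageDataGenerator itself is marked as deprecated in recent TensorFlow releases.

```python
import numpy as np


def augment_clip(datagen, clip):
    """Apply one random transform from `datagen` to every frame of `clip`.

    `datagen` is assumed to expose get_random_transform(img_shape) and
    apply_transform(x, transform_parameters), as ImageDataGenerator does.
    """
    # Draw the random parameters once, using the shape of a single frame.
    params = datagen.get_random_transform(clip.shape[1:])
    # Reuse the same parameters for each frame so the clip stays coherent.
    frames = [datagen.apply_transform(frame, params) for frame in clip]
    return np.stack(frames)


def augment_batch(datagen, clips):
    """Augment a batch of clips, drawing fresh parameters for each clip."""
    return np.stack([augment_clip(datagen, clip) for clip in clips])
```

With the aug object from the question, a call such as augment_batch(aug, X_batch) could replace the rotation loop inside __getitem__, where X_batch stands for whatever slice of clips makes up the batch.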
