tensorflow - Keras Data Augmentation with ImageDataGenerator (Your input ran out of data)

I am currently learning how to perform data augmentation with the Keras ImageDataGenerator, following "Deep Learning with Python" by François Chollet.

I now have 1,000 dog and 1,000 cat images in my training dataset, and 500 dog and 500 cat images in my validation dataset.

The book sets the batch size to 32 for both the training and validation generators, and performs the augmented training by passing both steps_per_epoch and epochs when fitting the model.

However, when I train the model I receive the TensorFlow warning "Your input ran out of data..." and the training process stops.

I searched online, and many solutions say the steps should be set as steps_per_epoch = len(train_dataset) // batch_size and validation_steps = len(validation_dataset) // batch_size.

I understand the logic above, and with those settings there is no warning during training.

But I am wondering: originally I have 2,000 training samples. That is too few, which is why I need data augmentation to increase the number of training images. If steps_per_epoch = len(train_dataset) // batch_size is applied and len(train_dataset) is only 2,000, am I not still using 2,000 samples to train the model instead of adding more augmented images? (A sketch of the suggested fix follows, then my current code.)
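To make the suggestion concrete, here is a minimal sketch of it, assuming my sample counts of 2,000 training and 1,000 validation images (the variable names are illustrative, not from the book):

batch_size = 32

# Illustrative sample counts for my setup.
num_train_samples = 2000       # 1,000 dogs + 1,000 cats
num_validation_samples = 1000  # 500 dogs + 500 cats

steps_per_epoch = num_train_samples // batch_size         # 2000 // 32 = 62
validation_steps = num_validation_samples // batch_size   # 1000 // 32 = 31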

For reference, here is my current code from the book:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation configuration for training; validation images are only rescaled.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')

history = model.fit_generator(
    train_generator,
    steps_per_epoch=100,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=50)


1 Answer


ImageDataGenerator does not increase the size of your training set on disk. All augmentation is done in memory: an original image is augmented randomly, and its augmented version is what the generator returns. If you want to have a look at the augmented images, set these parameters on flow_from_directory:

save_to_dir=path,
save_prefix="",
save_format="png",
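For example, a minimal sketch (augmented_dir is a hypothetical output directory; the other arguments mirror the question's setup):

import os

augmented_dir = "augmented_previews"  # hypothetical directory for inspection
os.makedirs(augmented_dir, exist_ok=True)

# Same call as in the question, but every augmented image the generator
# yields is also written to disk as a PNG file so you can inspect it.
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary',
    save_to_dir=augmented_dir,
    save_prefix="aug",
    save_format="png")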

Now, you have 2,000 images, and with a batch size of 32 you get 2000 // 32 = 62 full batches per epoch, but you are asking for 100 steps, which is what causes the error. If you want to use all of the data points, then you should set:

steps_per_epoch = len(train_dataset) // batch_size
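Putting it together, here is a minimal sketch of the corrected fit call. It derives the step counts from the generators' samples and batch_size attributes rather than hard-coding them, and otherwise assumes the generators and model defined in the question:

# Derive step counts from the generators instead of hard-coding them.
steps_per_epoch = train_generator.samples // train_generator.batch_size            # 2000 // 32 = 62
validation_steps = validation_generator.samples // validation_generator.batch_size # 1000 // 32 = 31

history = model.fit_generator(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=validation_steps)

Each epoch still draws roughly 2,000 images, but because the augmentation is random, every epoch sees a different augmented variant of each image, so over 100 epochs the model effectively trains on far more than 2,000 distinct images.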
