Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


TensorFlow - Keras checkpoints not being saved to Google Cloud bucket

I'm using the following code to save checkpoints while a Google Cloud build runs my model:

    import tensorflow as tf

    # Save only the weights, and only when val_loss improves.
    cp_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath="gs://mybucket/checkpoints",
        verbose=0,
        save_weights_only=True,
        monitor='val_loss',
        mode='min',
        save_best_only=True,
    )

I'm getting no errors in my build logs, but the only thing in the bucket after each run is a tf_cloud_train_tar file containing the source directory contents.

I'm passing callbacks=[cp_callback] to model.fit.



1 Answer


I ran into this problem for two reasons:

  • The dataset was not in the storage bucket, so the code had no access to it.
  • I was using a generator for the dataset with no files behind it, which caused an infinite loop instead of a crash.

I switched to AI Platform and sourced my data from the GCS bucket, and that fixed the problem.
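
For anyone hitting the same symptom, here is a minimal sketch of one way to keep an endless generator from stalling training. The likely mechanism (my reading, not something confirmed above) is that ModelCheckpoint saves at the end of an epoch by default, so if the first epoch never ends, nothing is ever written and no error is raised. Passing steps_per_epoch and validation_steps to model.fit bounds each epoch. The helper names (checkpoint_prefix, endless_batches, take_epoch, fit_with_checkpoints), the run name, and the step counts are all hypothetical, and the bare gs:// prefix assumes a TF 2.x Keras that accepts it for weights-only checkpoints; newer Keras releases may require a specific filename suffix.

```python
import itertools


def checkpoint_prefix(bucket, run_name):
    """Build a gs:// prefix for weights-only checkpoints (names are placeholders)."""
    return f"gs://{bucket}/checkpoints/{run_name}/weights"


def endless_batches(batch_size=4):
    """Stand-in for a generator with no files behind it: it never stops.

    A real generator passed to Keras would yield arrays or tensors, not lists.
    """
    while True:
        yield [0.0] * batch_size, [0.0] * batch_size


def take_epoch(generator, steps_per_epoch):
    """Pull exactly steps_per_epoch batches, mimicking a bounded epoch."""
    return list(itertools.islice(generator, steps_per_epoch))


def fit_with_checkpoints(model, train_gen, val_gen, bucket, run_name,
                         steps_per_epoch=100, validation_steps=10, epochs=5):
    """Train with bounded epochs so the end-of-epoch checkpoint can run."""
    import tensorflow as tf  # imported lazily so the helpers above need no TF

    cp_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_prefix(bucket, run_name),
        save_weights_only=True,
        monitor="val_loss",
        mode="min",
        save_best_only=True,
        verbose=1,  # log each save so a missing checkpoint shows up in the logs
    )
    return model.fit(
        train_gen,
        steps_per_epoch=steps_per_epoch,    # bounds the training epoch
        validation_data=val_gen,
        validation_steps=validation_steps,  # bounds the validation pass
        epochs=epochs,
        callbacks=[cp_callback],
    )
```

Setting verbose=1 on the callback is a cheap diagnostic: if no save messages appear in the build logs, the epoch is probably never finishing or val_loss is never being computed.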

