
python - How do I make a copy of a GradientTape object in TensorFlow? (Monte Carlo app)

I have an expensive initialization op that runs before I perform my calculation in TensorFlow.

My code looks something like this:

import tensorflow as tf

x = tf.Variable(2.0)
w = tf.Variable(5.0)

with tf.GradientTape() as tape:
  tape.watch(x)  # redundant for Variables, but harmless
  tape.watch(w)
  y = x ** 2
  z = w ** 3
  o = tf.math.log(y * z)  # note: this step stands in for the arbitrarily complex init code

# Now I need to run a loop n times (here n is 10).
res = []
for i in range(10):
  with tape:
    z = tf.random.normal([1, 10])
    f = tf.reduce_sum(x * z, axis=1) * o + w
  df = tape.gradient(f, {'x': x, 'w': w})
  res.append(df)

Basically I'm trying to run a Monte Carlo simulation and need the gradients without having to rerun the initialization code on every loop iteration. This code works fine if n == 1, but gives the wrong answers if n >= 2.
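For what it's worth, I'm aware that a non-persistent tape only allows a single gradient() call, so re-entering the same tape and calling gradient() on every iteration may already be misusing it. A minimal illustration of that restriction:

import tensorflow as tf

x = tf.Variable(2.0)
with tf.GradientTape() as g:
  y = x * x
print(g.gradient(y, x))  # first call works: tf.Tensor(4.0, ...)
g.gradient(y, x)         # RuntimeError: a non-persistent tape can only compute one set of gradients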

What I need is a way to copy the state of tape before I start the Monte Carlo loop, so that instead of saying "with tape" I could write something like:

  with tf.GradientTape(tape) as tape2:
    ...
  df = tape2.gradient(f, {'x': x, 'w': w})

Is this possible? How can I achieve something similar?
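The closest workaround I've come up with is not to copy the tape at all, but to take the gradients of o once and then compose the pieces with the chain rule by hand. This is only a sketch, under the assumption that o is the sole link between the init code and the per-iteration code:

import tensorflow as tf

x = tf.Variable(2.0)
w = tf.Variable(5.0)

# Record the expensive init exactly once, and extract do/dx and do/dw once.
with tf.GradientTape() as init_tape:
  o = tf.math.log((x ** 2) * (w ** 3))  # stand-in for the expensive init
do = init_tape.gradient(o, {'x': x, 'w': w})

res = []
for i in range(10):
  z = tf.random.normal([1, 10])
  with tf.GradientTape() as tape:
    tape.watch(o)  # treat o as a leaf here, not as a function of x and w
    f = tf.reduce_sum(x * z, axis=1) * o + w
  g = tape.gradient(f, {'x': x, 'w': w, 'o': o})
  # Chain rule: df/dx = (df/dx with o held fixed) + (df/do)(do/dx); same for w.
  df = {k: g[k] + g['o'] * do[k] for k in ('x', 'w')}
  res.append(df)

Each iteration then records only the cheap part of the graph, at the price of writing out the chain rule myself; I'd still prefer a real tape copy if one exists.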

As a second part to the question, I've noticed that even if I recompute the value of o inside the main loop, TensorFlow only works if the tape is not persistent. If it is persistent, I run out of memory on the GPU after several iterations of the loop. This is not ideal, as I'd also like to define other functions that depend on x and w and record their gradients as well.

That is, if I do this:

res = []
for i in range(10):
  with tf.GradientTape(persistent=True) as tape:
    z = tf.random.normal([1, 10])

    # rerun init for every loop
    y = x ** 2
    t = w ** 3
    o = tf.math.log(y * t)

    f = tf.reduce_sum(x * z, axis=1) * o + w
    g = tf.reduce_sum(x * z + w, axis=1) * o

  df = tape.gradient(f, {'x': x, 'w': w})
  dg = tape.gradient(g, {'x': x, 'w': w})
  res.append([df, dg])

I don't understand this behavior: surely the tape is being discarded after each iteration of the loop (and thus it shouldn't matter whether it was persistent or not)?
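The only mitigation I've found in the GradientTape docs is that a persistent tape holds on to its resources until the tape object is garbage collected, and the docs recommend an explicit del tape when you are done with it. Whether that actually cures the GPU out-of-memory here is an assumption on my part, but the loop would look like:

import tensorflow as tf

x = tf.Variable(2.0)
w = tf.Variable(5.0)

res = []
for i in range(10):
  with tf.GradientTape(persistent=True) as tape:
    z = tf.random.normal([1, 10])
    y = x ** 2
    t = w ** 3
    o = tf.math.log(y * t)
    f = tf.reduce_sum(x * z, axis=1) * o + w
    g = tf.reduce_sum(x * z + w, axis=1) * o
  df = tape.gradient(f, {'x': x, 'w': w})
  dg = tape.gradient(g, {'x': x, 'w': w})
  del tape  # explicitly release the tape's hold on the recorded intermediates
  res.append([df, dg])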

Thanks



1 Answer

Waiting for an expert to answer.

