python - multiple GPUs and AdamOptimizer in TensorFlow
I'm practicing multi-GPU training in TensorFlow by averaging the gradients computed on each GPU. It doesn't work when the optimizer is AdamOptimizer, but it does work when I'm using GradientDescentOptimizer.
Here is the code:
g = tf.Graph()
with g.as_default(), tf.device('/cpu:0'):
    full_data_dims = [batch_size*num_gpus] + data_dims
    data = tf.placeholder(dtype=tf.float32, shape=full_data_dims, name='data')
    labels = tf.placeholder(dtype=tf.int32, shape=[batch_size*num_gpus], name='labels')

    split_data = tf.split(data, num_gpus, axis=0)
    split_labels = tf.split(labels, num_gpus, axis=0)

    optimizer = tf.train.AdamOptimizer(learning_rate)

    replica_grads = []
    for i in range(num_gpus):
        with tf.name_scope('tower_{}'.format(i)), tf.device('/gpu:{}'.format(i)):
            model = build_model(split_data[i], split_labels[i])
            loss = model['loss']
            grads = optimizer.compute_gradients(loss)
            replica_grads.append(grads)
            tf.get_variable_scope().reuse_variables()

    tf.get_variable_scope().reuse_variables()
    average_grad = average_gradients_layer(replica_grads)
    grad_step = optimizer.apply_gradients(average_grad)
    train_step = tf.group(grad_step)
    init = tf.global_variables_initializer()

# part3
config_proto = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(graph=g, config=config_proto)
sess.run(init)
tf.train.start_queue_runners(sess=sess)

with sess.as_default():
    for step in range(num_steps):
        data_batch, label_batch = batch_maker(x_ok, y_ok, x_ng, y_ng, batch_size*num_gpus)
        results = sess.run([train_step, loss], feed_dict={data: data_batch, labels: label_batch})
        if step % flag == 0:
            print('\n')
            print('step : %s loss : %s' % (step, results[1]))
        sys.stdout.write('\r' + str(step) + '/' + str(num_steps))
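For readers following along, average_gradients_layer is not shown in the question. A minimal sketch of what such a gradient-averaging helper might look like (this is my own reconstruction, not the original code) is:

import tensorflow as tf

def average_gradients_layer(replica_grads):
    """Average a list of per-GPU [(gradient, variable), ...] lists element-wise."""
    average_grads = []
    for grad_and_vars in zip(*replica_grads):
        # grad_and_vars is ((grad_gpu0, var), (grad_gpu1, var), ...);
        # every tower refers to the same shared variable.
        grads = tf.stack([g for g, _ in grad_and_vars], axis=0)
        grad = tf.reduce_mean(grads, axis=0)
        average_grads.append((grad, grad_and_vars[0][1]))
    return average_grads

# This sketch assumes every tower produces a (non-None) gradient for every variable.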
Here is the error message:
     32         tf.get_variable_scope().reuse_variables()
     33     average_grad = average_gradients_layer(replica_grads)
---> 34     grad_step = optimizer.apply_gradients(average_grad)
     35     train_step = tf.group(grad_step)
     36     init = tf.global_variables_initializer()

Variable conv1_1/weight/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
It seems that AdamOptimizer looks for additional '/Adam/' variables appended to each variable name. Can anyone fix it?
I don't know whether this is a bug or not, but the question was "can anyone fix it?", and the answer is yes.
Encapsulate the GPU loop (but not the apply_gradients code) in a "with tf.variable_scope(...)" context manager, so that the scope stops being reused once the GPU loop is exited. The error arises because AdamOptimizer creates its slot variables (the '.../Adam' and '.../Adam_1' entries) with tf.get_variable() inside apply_gradients(), and that creation fails while the enclosing variable scope is still in reuse mode; GradientDescentOptimizer has no slot variables, which is why it works.
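A minimal sketch of that rearrangement, reusing the names from the question (the scope name 'model' is my own choice, not something from the original code):

import tensorflow as tf

# placeholders, split_data / split_labels, optimizer, build_model and
# average_gradients_layer are the same as in the question above.

replica_grads = []
with tf.variable_scope('model'):   # dedicated variable scope for the towers
    for i in range(num_gpus):
        with tf.name_scope('tower_{}'.format(i)), tf.device('/gpu:{}'.format(i)):
            model = build_model(split_data[i], split_labels[i])
            loss = model['loss']
            grads = optimizer.compute_gradients(loss)
            replica_grads.append(grads)
            # Share the model variables between the towers.
            tf.get_variable_scope().reuse_variables()

# Back at the root scope, which was never switched to reuse mode,
# so apply_gradients() can create the Adam slot variables here.
average_grad = average_gradients_layer(replica_grads)
grad_step = optimizer.apply_gradients(average_grad)
train_step = tf.group(grad_step)

Because reuse_variables() now only flips the reuse flag on the 'model' scope, it no longer affects the scope in which apply_gradients() creates the '.../Adam' variables.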