python - Multiple GPUs and AdamOptimizer in TensorFlow -


I'm practicing TensorFlow with multiple GPUs by averaging the gradients computed on each GPU. It doesn't work when the optimizer is AdamOptimizer, but it works when I'm using GradientDescent.

Here is the code:

g = tf.Graph()
with g.as_default(), tf.device('/cpu:0'):
    full_data_dims = [batch_size*num_gpus] + data_dims
    data = tf.placeholder(dtype=tf.float32, shape=full_data_dims, name='data')
    labels = tf.placeholder(dtype=tf.int32, shape=[batch_size*num_gpus], name='labels')

    split_data = tf.split(data, num_gpus, axis=0)
    split_labels = tf.split(labels, num_gpus, axis=0)

    optimizer = tf.train.AdamOptimizer(learning_rate)

    replica_grads = []
    for i in range(num_gpus):
        with tf.name_scope('tower_{}'.format(i)), tf.device('/gpu:{}'.format(i)):
            model = build_model(split_data[i], split_labels[i])
            loss = model['loss']
            grads = optimizer.compute_gradients(loss)
            replica_grads.append(grads)
            tf.get_variable_scope().reuse_variables()

    tf.get_variable_scope().reuse_variables()
    average_grad = average_gradients_layer(replica_grads)
    grad_step = optimizer.apply_gradients(average_grad)
    train_step = tf.group(grad_step)
    init = tf.global_variables_initializer()

# part3
config_proto = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(graph=g, config=config_proto)
sess.run(init)
tf.train.start_queue_runners(sess=sess)
with sess.as_default():
    for step in range(num_steps):
        data_batch, label_batch = batch_maker(x_ok, y_ok, x_ng, y_ng, batch_size*num_gpus)
        results = sess.run([train_step, loss], feed_dict={data: data_batch, labels: label_batch})
        if step % flag == 0:
            print('\n')
            print('step : %s loss : %s' % (step, results[1]))
        sys.stdout.write('\r'+str(step)+'/'+str(num_steps))
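(average_gradients_layer is not shown in the question. For context, a minimal sketch of such a helper, assuming each entry of replica_grads is the [(gradient, variable), ...] list returned by compute_gradients on one tower; the function name comes from the question, the body is illustrative:)

import tensorflow as tf

def average_gradients_layer(replica_grads):
    # replica_grads: one [(grad, var), ...] list per GPU, all in the same variable order.
    average_grads = []
    for grads_and_vars in zip(*replica_grads):          # group the same variable across towers
        grads = [tf.expand_dims(g, 0) for g, _ in grads_and_vars]
        mean_grad = tf.reduce_mean(tf.concat(grads, axis=0), axis=0)
        average_grads.append((mean_grad, grads_and_vars[0][1]))  # variable is shared, take it from the first tower
    return average_grads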

Here is the error message:

 32     tf.get_variable_scope().reuse_variables()
 33     average_grad = average_gradients_layer(replica_grads)
---> 34     grad_step = optimizer.apply_gradients(average_grad)
 35     train_step = tf.group(grad_step)
 36     init = tf.global_variables_initializer()

Variable conv1_1/weight/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

It seems that AdamOptimizer looks for an additional '/Adam/' appended to each variable name. Can anyone fix it?
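(For reference, the error can be reproduced in a much smaller graph. This is an illustrative sketch, not the asker's code, assuming the same TensorFlow 1.x API: Adam has to create slot variables named <variable>/Adam and <variable>/Adam_1 via tf.get_variable(), and that fails once the surrounding variable scope has been switched to reuse mode:)

import tensorflow as tf

w = tf.get_variable('w', shape=[3], initializer=tf.zeros_initializer())
loss = tf.reduce_sum(tf.square(w))
optimizer = tf.train.AdamOptimizer(0.001)
grads = optimizer.compute_gradients(loss)

tf.get_variable_scope().reuse_variables()      # root scope is now permanently in reuse mode
train_step = optimizer.apply_gradients(grads)  # Adam cannot create 'w/Adam' -> same "does not exist" error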

I don't know whether this is a bug or not, but since the question is "can anyone fix it?", the answer is yes.

Encapsulate your GPU loop (but not the apply_gradients code) in a "with tf.variable_scope(...):" context manager, so that the scope stops being reused once the GPU loop is exited.
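Applied to the code in the question, a sketch of that fix could look like this (the scope name 'model' is arbitrary; placeholders, split_data and split_labels stay as above):

    replica_grads = []
    with tf.variable_scope('model'):                       # reuse is confined to this scope
        for i in range(num_gpus):
            with tf.name_scope('tower_{}'.format(i)), tf.device('/gpu:{}'.format(i)):
                model = build_model(split_data[i], split_labels[i])
                loss = model['loss']
                replica_grads.append(optimizer.compute_gradients(loss))
                tf.get_variable_scope().reuse_variables()  # only flips reuse for 'model'

    # back at the root scope, reuse is off again, so Adam can create its slot variables
    average_grad = average_gradients_layer(replica_grads)
    grad_step = optimizer.apply_gradients(average_grad)

GradientDescentOptimizer never runs into this because it does not create any slot variables.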

