python - Distributed TensorFlow: Cannot colocate nodes and Cannot merge devices with incompatible tasks
if job_name == "ps":
    server.join()
elif job_name == "worker":
    with tf.device(tf.train.replica_device_setter(
            worker_device="/job:worker/task:%d" % task_index,
            merge_devices=False,
            cluster=cluster,
            ps_strategy=greedy)):
        norm_prjct_op = _norm_projected_cxx()
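For reference, the cluster, server, and greedy strategy referenced above are set up roughly as follows (a minimal sketch against the TF 1.x API; the host names and two-PS layout are assumptions):

import tensorflow as tf

# Two parameter servers and two workers (host names are placeholders)
cluster = tf.train.ClusterSpec({
    "ps": ["ps0:2222", "ps1:2222"],
    "worker": ["worker0:2222", "worker1:2222"],
})
server = tf.train.Server(cluster, job_name=job_name, task_index=task_index)

# Greedy load balancing: each new variable goes to the PS task that
# currently holds the fewest bytes.
greedy = tf.contrib.training.GreedyLoadBalancingStrategy(
    num_tasks=cluster.num_tasks("ps"),
    load_fn=tf.contrib.training.byte_size_load_fn)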
norm_prjct_op is a self-defined C++ operation. The program ran fine with one parameter server; however, after I added a second parameter server, I got this error:
Cannot colocate nodes 'normprjctop' and 'h_grad/shape': Cannot merge devices with incompatible tasks: '/job:ps/task:1' and '/job:ps/task:0' [[Node: normprjctop = normprjctop[_class=["loc:@entitys", "loc:@relations", "loc:@mh", "loc:@mt"], _device="/job:ps/task:0"](mh, mt, relations, entitys)]]
In tf.train.replica_device_setter, I tried GreedyLoadBalancingStrategy as the ps_strategy and also merge_devices=False, but neither of them worked.
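For context, the variables named in the op's colocation group (entitys, relations, mh, mt) are created under the same device setter, roughly like this (a minimal sketch that reuses cluster and task_index from above; the shapes and variable-creation code are assumptions):

with tf.device(tf.train.replica_device_setter(
        worker_device="/job:worker/task:%d" % task_index,
        cluster=cluster)):
    # Embedding variables; the custom op takes all four as inputs,
    # which is where the "loc:@..." colocation constraints come from.
    entitys   = tf.get_variable("entitys",   shape=[10000, 100])  # placeholder sizes
    relations = tf.get_variable("relations", shape=[1000, 100])
    mh        = tf.get_variable("mh",        shape=[1000, 100])
    mt        = tf.get_variable("mt",        shape=[1000, 100])

With the default round-robin ps_strategy and two PS tasks, consecutive variables alternate between /job:ps/task:0 and /job:ps/task:1, which matches the two incompatible tasks reported in the error.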
Any advice? Thanks!