python - After the training is finished, data remains in the shuffle_batch queue


When I run my training code, it ends with this warning:

W tensorflow/core/framework/op_kernel.cc:1152] Out of range: FIFOQueue '_0_tower_0/input_producer' is closed and has insufficient elements (requested 1, current size 0) [[Node: tower_0/ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/cpu:0"](tower_0/TFRecordReaderV2, tower_0/input_producer)]]

Also, the shuffle_batch log shows that the queue has not been fully dequeued by the end of training.

Does anyone know why the program quits before consuming the remaining data in the shuffle_batch queue?


Below is a simplified version of the code I use to train my model.

I'm storing images and labels (1000~2000 samples) in a single TFRecord file.

So I set flags.datafile = ['some.tfrecord'], flags.max_epoch = 1000, and flags.batch_size = 350.
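To check whether the queue was really drained, it helps to know how many training steps these settings should produce. The following is a rough sketch in plain Python, not part of the training code: `expected_steps` is a hypothetical helper, and it assumes every sample of every epoch is delivered, including a smaller final batch.

```python
def expected_steps(num_samples, max_epoch, batch_size):
    """Return (steps, final_batch_size), assuming every queued sample is consumed."""
    total = num_samples * max_epoch
    # Ceiling division: the last, possibly smaller, batch still counts as a step.
    steps = -(-total // batch_size)
    final_batch = total - (steps - 1) * batch_size
    return steps, final_batch


# Lower and upper bound of the sample count mentioned above.
for num_samples in (1000, 2000):
    steps, final_batch = expected_steps(num_samples, max_epoch=1000, batch_size=350)
    print(num_samples, steps, final_batch)
```

Counting the calls to `sess.run(train_op)` and comparing the count with these numbers would show how many batches, if any, are left unconsumed.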

import tensorflow as tf

# `flags` holds the settings listed above and `my_model` is my model module;
# both are defined elsewhere.

def read_and_crop(fqueue):
    # Read one serialized example from the filename queue and decode its image.
    reader = tf.TFRecordReader()
    key, data = reader.read(fqueue)
    features = tf.parse_single_example(
            data,
            features={
                'name': tf.FixedLenFeature([], tf.string),
                'image': tf.FixedLenFeature([], tf.string),
                })
    decoded_image = tf.image.decode_jpeg(features['image'], channels=3)
    return decoded_image, features['name']

def batch_input(flags, min_after_dequeue=3000, num_threads=3):
    # The filename queue is closed after max_epoch passes over the file list.
    fqueue = tf.train.string_input_producer(
            flags.datafile,
            num_epochs=flags.max_epoch,
            )
    image, name = read_and_crop(fqueue)
    capacity = min_after_dequeue + int(num_threads*1.5)*flags.batch_size
    images, names = tf.train.shuffle_batch(
            [image, name],
            batch_size=flags.batch_size,
            capacity=capacity,
            min_after_dequeue=min_after_dequeue,
            num_threads=num_threads,
            allow_smaller_final_batch=True)
    return images, names

def train():
    with tf.Graph().as_default():
        global_step = tf.train.create_global_step()
        images, names = batch_input(flags)
        logits = my_model.inference(images)
        loss = my_model.loss(logits, names)
        train_op = my_model.train(loss)

        # num_epochs creates a local variable, so local variables must be initialized too.
        init = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
        with tf.Session() as sess:
            sess.run(init)
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(sess=sess, coord=coord)
            try:
                while not coord.should_stop():
                    sess.run(train_op)
            except tf.errors.OutOfRangeError:
                pass
            finally:
                coord.request_stop()
            coord.join(threads)
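For reference, the capacity expression in batch_input can be evaluated by hand for the settings given in the question. This is only an arithmetic sketch: `shuffle_queue_capacity` is a hypothetical name, and the formula and defaults are copied from the code above.

```python
def shuffle_queue_capacity(batch_size, min_after_dequeue=3000, num_threads=3):
    # Same expression as in batch_input: int(3 * 1.5) == 4 extra batches of headroom.
    return min_after_dequeue + int(num_threads * 1.5) * batch_size


capacity = shuffle_queue_capacity(batch_size=350)
print(capacity)  # 3000 + 4 * 350 = 4400
```

With 1000~2000 samples in the file, min_after_dequeue=3000 is larger than one pass over the data, so the shuffle queue can hold samples from more than one epoch at a time.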

