Question

I am trying to run facial keypoint regression. I have successfully created a TFRecord file in which the images and labels are encoded (the labels are the facial keypoints).

I then started loading the data (images and keypoints) into memory, following this guide: https://gist.github.com/FirefoxMetzger/c143c340c71e85c0c23c7ced94a88c16#file-faster_fully_connected_reader-py. I wanted to batch all the images first and then decode them, as described in that guide. However, this does not work. If I understand correctly, tf.image.decode_image() can only be used on a single image, not on a batch. Is that right? If so, how can I decode a batch of images?

Thanks in advance!

Here is the code:

ds = tf.data.TFRecordDataset(TFR_FILENAME)

ds = ds.repeat(EPOCHS)

ds = ds.shuffle(BUFFER_SIZE + BATCH_SIZE)

ds = ds.batch(BATCH_SIZE)

Finally, I tried to decode the image using tf.image.decode_image():

feature_description = {'height': tf.io.FixedLenFeature([], tf.int64),
                    'width': tf.io.FixedLenFeature([], tf.int64),
                    'depth': tf.io.FixedLenFeature([], tf.int64),
                    'kpts': tf.io.FixedLenFeature([136], tf.float32),
                    'image_raw': tf.io.FixedLenFeature([], tf.string),
                    }

for record in ds.take(1):
    record = tf.io.parse_example(record, feature_description)
    decoded_image = tf.io.decode_image(record['image_raw'], dtype=tf.float32)

This raises the following ValueError:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-3583be9c40ab> in <module>()
      1 for record in ds.take(1):
      2     record = tf.io.parse_example(record, feature_description)
----> 3     decoded_image = tf.io.decode_image(record['image_raw'], dtype=tf.float32)

3 frames
/tensorflow-2.0.0/python3.6/tensorflow_core/python/ops/image_ops_impl.py in decode_image(contents, channels, dtype, name, expand_animations)
   2315     # as well as JPEG images with EXIF data (start with \xff\xd8\xff\xe1).
   2316     return control_flow_ops.cond(
-> 2317         is_jpeg(contents), _jpeg, check_png, name='cond_jpeg')
   2318 
   2319 

/tensorflow-2.0.0/python3.6/tensorflow_core/python/util/deprecation.py in new_func(*args, **kwargs)
    505                 'in a future version' if date is None else ('after %s' % date),
    506                 instructions)
--> 507       return func(*args, **kwargs)
    508 
    509     doc = _add_deprecated_arg_notice_to_docstring(

/tensorflow-2.0.0/python3.6/tensorflow_core/python/ops/control_flow_ops.py in cond(pred, true_fn, false_fn, strict, name, fn1, fn2)
   1199   with ops.name_scope(name, "cond", [pred]):
   1200     if context.executing_eagerly():
-> 1201       if pred:
   1202         result = true_fn()
   1203       else:

/tensorflow-2.0.0/python3.6/tensorflow_core/python/framework/ops.py in __bool__(self)
    874 
    875   def __bool__(self):
--> 876     return bool(self._numpy())
    877 
    878   __nonzero__ = __bool__

Answer 1

Indeed, decode_image only works on a single image. You can still get reasonable performance by doing the decoding inside the dataset, before batching.

Something like this (the code is untested, so it may need some tweaking):

ds = tf.data.TFRecordDataset(TFR_FILENAME)

def parse_and_decode(record):
  # Records are still unbatched here, so parse one serialized Example at a time.
  record = tf.io.parse_single_example(record, feature_description)
  record['image'] = tf.io.decode_image(record['image_raw'], dtype=tf.float32)
  return record

ds = ds.map(parse_and_decode)

ds = ds.repeat(EPOCHS)

ds = ds.shuffle(BUFFER_SIZE + BATCH_SIZE)
...
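
For reference, the decode-before-batch idea can be sketched end to end as below. This is only a sketch under assumptions, not the asker's actual pipeline: the temporary file, the 8x8x3 image size, the batch size of 2, and the helper make_example are all hypothetical and exist only so the example is self-contained. It also assumes every image has the same size, since batch() needs a consistent shape; passing expand_animations=False and calling set_shape gives the decoded tensor a static shape.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical values for illustration; the original post does not give them.
TFR_FILENAME = os.path.join(tempfile.mkdtemp(), "demo_faces.tfrecord")
BATCH_SIZE = 2
HEIGHT, WIDTH, DEPTH = 8, 8, 3

feature_description = {
    "kpts": tf.io.FixedLenFeature([136], tf.float32),
    "image_raw": tf.io.FixedLenFeature([], tf.string),
}


def make_example(seed):
    """Build one serialized tf.train.Example with a random PNG and keypoints."""
    rng = np.random.RandomState(seed)
    pixels = rng.randint(0, 256, size=(HEIGHT, WIDTH, DEPTH)).astype(np.uint8)
    encoded = tf.io.encode_png(pixels).numpy()
    kpts = rng.rand(136).astype(np.float32).tolist()
    features = tf.train.Features(feature={
        "kpts": tf.train.Feature(float_list=tf.train.FloatList(value=kpts)),
        "image_raw": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[encoded])),
    })
    return tf.train.Example(features=features).SerializeToString()


# Write a tiny demo file so the pipeline below has something to read.
with tf.io.TFRecordWriter(TFR_FILENAME) as writer:
    for seed in range(4):
        writer.write(make_example(seed))


def parse_and_decode(serialized):
    # One serialized Example in, one (image, keypoints) pair out.
    record = tf.io.parse_single_example(serialized, feature_description)
    image = tf.io.decode_image(
        record["image_raw"], channels=DEPTH, dtype=tf.float32,
        expand_animations=False)
    # Assumption: all images share this size, so batching can stack them.
    image.set_shape([HEIGHT, WIDTH, DEPTH])
    return image, record["kpts"]


ds = tf.data.TFRecordDataset(TFR_FILENAME)
ds = ds.map(parse_and_decode)   # decode each image individually
ds = ds.batch(BATCH_SIZE)       # only then stack the decoded tensors

images, keypoints = next(iter(ds))
print(images.shape, keypoints.shape)
```

If the real images vary in size, a resize step inside parse_and_decode (before batching) would be needed in place of set_shape.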

Source: https://stackoverflow.com/questions/59202302/unable-to-decode-batch-of-images-with-tf-io-decode-image