[TPU] DataLossError: truncated record at 269402 [[{{node MultiDeviceIteratorGetNextFromShard}}]] [[RemoteCall]]

DataLossError                             Traceback (most recent call last)
<ipython-input-43-01c4753530b2> in <module>()
      9     total_loss = 0
     10 
---> 11     for (batch, (img_tensor, target)) in enumerate(train_dist_ds):
     12         strategy.run(train_step_fn, args=(img_tensor, target))
     13 

8 frames
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)

DataLossError: truncated record at 269402
	 [[{{node MultiDeviceIteratorGetNextFromShard}}]]
	 [[RemoteCall]] 

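I had a model that was already trained for 25 epochs on 10 million samples. The error occurred while I was training it further with an additional 7 million samples.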
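A "truncated record" message generally means that a TFRecord file ends in the middle of a record, for example after an interrupted write or upload. The post does not confirm the cause, so treating one of the newly added files as the culprit is only an assumption. The sketch below is one way to locate such a file without TensorFlow. It relies on the documented TFRecord layout (8-byte little-endian length, 4-byte CRC of the length, payload, 4-byte CRC of the payload) and assumes the files are uncompressed and readable from the local filesystem. The names `find_truncation` and `scan` are hypothetical helpers, not part of any library, and the CRCs are deliberately not verified.

```python
import struct
from pathlib import Path

HEADER_BYTES = 12  # uint64 payload length + uint32 masked CRC of that length
FOOTER_BYTES = 4   # uint32 masked CRC of the payload


def find_truncation(path):
    """Return the byte offset of the first incomplete record in `path`,
    or None if the file ends exactly on a record boundary.

    Only the record framing is checked; CRC values are not verified.
    """
    size = Path(path).stat().st_size
    offset = 0
    with open(path, "rb") as f:
        while offset < size:
            f.seek(offset)
            header = f.read(HEADER_BYTES)
            if len(header) < HEADER_BYTES:
                return offset  # file ends inside the record header
            (length,) = struct.unpack("<Q", header[:8])
            next_offset = offset + HEADER_BYTES + length + FOOTER_BYTES
            if next_offset > size:
                return offset  # file ends inside the payload or its CRC
            offset = next_offset
    return None


def scan(paths):
    """Map each damaged file to the offset of its first incomplete record."""
    damaged = {}
    for p in paths:
        offset = find_truncation(p)
        if offset is not None:
            damaged[str(p)] = offset
    return damaged
```

Calling `scan` on the list of shard paths returns only the files that are cut short, which can then be regenerated or uploaded again. The reported offset is the start of the incomplete record, which I would expect to line up with the number in the error message (269402 here), though I have not confirmed that against this dataset. For files stored in a GCS bucket, as TPU training usually requires, they would need to be copied locally first or opened through a GCS-aware file API instead of the built-in `open`.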