
[spark] How to resolve a NoSuchElementException

by hs_seo 2018. 1. 8.

While running a Spark job, the error shown below can occur.

It is most likely caused by running out of memory during the shuffle phase.

In that case, adding the spark.sql.shuffle.partitions setting resolves it: more shuffle partitions mean each task processes a smaller slice of the data, which lowers per-task memory pressure.

The job completed successfully after adding the following settings.


spark.sql.shuffle.partitions=300

spark.default.parallelism=300
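These settings can be supplied at submission time without any code changes, for example via spark-submit. This is a sketch only: the --class name and jar path below are placeholders, not from the original post.

```shell
# Raise shuffle partitions and default parallelism so each task handles
# a smaller slice of data, reducing per-task memory pressure during shuffle.
# com.example.MyJob and my-job.jar are hypothetical placeholders.
spark-submit \
  --conf spark.sql.shuffle.partitions=300 \
  --conf spark.default.parallelism=300 \
  --class com.example.MyJob \
  my-job.jar
```

Note that spark.sql.shuffle.partitions applies to Spark SQL / DataFrame shuffles, while spark.default.parallelism covers RDD operations, so setting both keeps the two APIs consistent.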


diagnostics: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 2.0 failed 4 times, most recent failure: Lost task 1.3 in stage 2.0 (TID 436: java.io.IOException: java.util.NoSuchElementException

 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1283)

 at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:174)

 at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:65)

 at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:65)

 at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:89)

 at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)

 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:72)

 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)

 at org.apache.spark.scheduler.Task.run(Task.scala:86)

 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)

 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

 at java.lang.Thread.run(Thread.java:745)

Caused by: java.util.NoSuchElementException

 at org.apache.spark.util.collection.PrimitiveVector$$anon$1.next(PrimitiveVector.scala:58)

 at org.apache.spark.storage.memory.PartiallyUnrolledIterator.next(MemoryStore.scala:697)

 at org.apache.spark.util.CompletionIterator.next(CompletionIterator.scala:30)

 at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$2.apply(TorrentBroadcast.scala:178)

 at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$2.apply(TorrentBroadcast.scala:178)

 at scala.Option.map(Option.scala:146)

 at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:178)

 at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1276)

