
[Spark] Spark Executor Memory Structure

•8• 2024. 3. 23. 15:29

Reference: https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794

 

Spark Executor Memory Structure

Image source: https://medium.com/analytics-vidhya/spark-memory-management-583a16c1253f

As the image shows, the memory of an executor container is divided broadly into two parts:

  • Memory Inside of JVM: spark.executor.memory
  • Memory Outside of JVM: spark.yarn.executor.memoryOverhead

Object read/write speed ranks, from fastest to slowest, `on-heap > off-heap > disk`.

Memory Inside of JVM (=On-Heap Memory = In-Memory)

This is the memory region managed by the Java GC. Most of an application's data is stored in on-heap memory.

`spark.executor.memory` is the Spark configuration value that sets how much memory is allocated to each executor. If the value is too small, tasks cannot get enough memory, which can degrade performance or cause OOM errors. If it is too large, the cluster's memory usage rises and other applications can be affected.

Image source: https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794

On-heap memory has three memory regions, as shown in the image above.

Spark Memory

This is the memory region managed by Spark. When Spark does in-memory computing, it stores intermediate state in Spark memory.

(Java Heap - Reserved Memory) * spark.memory.fraction

In the first image, you can see that the Spark memory region is split into two parts.

  • `spark.memory.fraction * (1 - spark.memory.storageFraction)`: Execution
  • `spark.memory.fraction * spark.memory.storageFraction`: Storage

The part labeled fraction in the image is used for computation such as shuffles, joins, sorts, and aggregations, and the part labeled storageFraction is used to cache data partitions.

The size of the storage part is adjusted with `spark.memory.storageFraction`, whose default is 0.5 (50%).

a. Storage Memory

Storage memory holds cached data, broadcast variables, and so on. The "unroll" process, which converts serialized data into deserialized form, also uses storage memory.

(Java Heap - Reserved Memory) * spark.memory.fraction * spark.memory.storageFraction

b. Execution Memory

Execution memory stores the objects needed while Spark tasks are running.

During shuffles, joins, sorts, and aggregations, intermediate buffers (temporary data) and hash tables are kept in execution memory.

It is shorter-lived than storage memory: it is released as soon as a task finishes, which frees space for the next task.

(Java Heap - Reserved Memory) * spark.memory.fraction * (1.0 - spark.memory.storageFraction)

 

 

User Memory

User memory stores user-defined data structures, Spark internal metadata, UDFs, and the data needed for RDD transformations (such as RDD dependencies).

(Java Heap - Reserved Memory) * (1.0 - spark.memory.fraction)
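The formulas above can be checked with a small worked example. The sketch below is plain Python, not Spark code. It assumes a 4 GiB heap, the 300MB of reserved memory that Spark hard-codes, and Spark's documented defaults of spark.memory.fraction = 0.6 and spark.memory.storageFraction = 0.5. The function name `on_heap_regions` is made up for illustration.

```python
MIB = 1024 * 1024
RESERVED_MEMORY = 300 * MIB  # hard-coded reserved memory (300MB)


def on_heap_regions(heap_bytes, memory_fraction=0.6, storage_fraction=0.5):
    """Split a JVM heap into the on-heap regions, following the formulas in the text.

    memory_fraction and storage_fraction stand in for spark.memory.fraction
    and spark.memory.storageFraction (defaults assumed: 0.6 and 0.5).
    """
    usable = heap_bytes - RESERVED_MEMORY
    spark_memory = usable * memory_fraction
    return {
        "reserved": RESERVED_MEMORY,
        "spark": spark_memory,
        "storage": spark_memory * storage_fraction,
        "execution": spark_memory * (1.0 - storage_fraction),
        "user": usable * (1.0 - memory_fraction),
    }


# Assumed example: a 4 GiB heap.
regions = on_heap_regions(4096 * MIB)
for name, size in regions.items():
    print(f"{name:>9}: {size / MIB:.1f} MiB")
```

With these assumptions, Spark memory comes to about 2277.6 MiB (split evenly into storage and execution) and user memory to about 1518.4 MiB. In a real executor the heap size reported by the JVM is usually slightly smaller than spark.executor.memory, so actual numbers will differ a little.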

 

Reserved Memory

This memory region is reserved for the system and stores Spark's internal objects.

It is hard-coded to 300MB. You can change the value and recompile Spark to use a different size, but this is reportedly not recommended in production.

// https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/memory/UnifiedMemoryManager.scala#L198
val RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024

 

In addition, if `spark.executor.memory` < `Reserved Memory * 1.5` (that is, below 450MB), the session fails to initialize.
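The minimum-size rule can be written out as a quick check. This is a sketch of the rule as stated above, not Spark's actual validation code. The function name `check_executor_memory` and the error message are hypothetical.

```python
MIB = 1024 * 1024
RESERVED_MEMORY = 300 * MIB               # hard-coded reserved memory
MIN_MEMORY = int(RESERVED_MEMORY * 1.5)   # 450 MiB threshold from the rule above


def check_executor_memory(executor_memory_bytes):
    """Raise if the executor memory is below Reserved Memory * 1.5."""
    if executor_memory_bytes < MIN_MEMORY:
        raise ValueError(
            f"executor memory {executor_memory_bytes} must be at least {MIN_MEMORY} bytes"
        )
    return executor_memory_bytes


check_executor_memory(512 * MIB)      # passes
try:
    check_executor_memory(400 * MIB)  # below 450 MiB, so this is rejected
except ValueError as err:
    print("rejected:", err)
```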

 

 

Memory Outside of JVM (=Off-Heap Memory = External-Memory)

* Note

Spark 2.2 and earlier: `spark.yarn.executor.memoryOverhead`

Spark 2.3 and later: `spark.executor.memoryOverhead`

Because YARN is not the only cluster manager (there is also Kubernetes, among others), the setting name that looked tied to YARN was reportedly changed to a general one.

 

Memory Managed by the Resource Manager

This is the memory region outside the reach of the JVM GC.

By default, once executor memory is set, the memoryOverhead size is derived as follows.

MAX(spark.executor.memory*0.1, 384MB)

 

You can also set it directly by specifying a value for `spark.executor.memoryOverhead`.

 

On-Heap and Off-Heap

If you set executor-memory=5g:

  • on-heap: 5G
  • off-heap: max(5G*0.1, 384MB) = about 500MB

In total, roughly 5GB + 500MB is allocated.
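The default rule `MAX(spark.executor.memory*0.1, 384MB)` can be worked through in a few lines. The sketch below is plain Python with hypothetical function names and works in MiB, so 5g is 5120 and the overhead comes out as 512 rather than the rounded 500MB.

```python
def default_memory_overhead_mb(executor_memory_mb, factor=0.1, minimum_mb=384):
    """Default overhead: the larger of 10% of executor memory and 384MB."""
    return max(int(executor_memory_mb * factor), minimum_mb)


def container_request_mb(executor_memory_mb):
    """On-heap memory plus the default overhead, i.e. what the container asks for."""
    return executor_memory_mb + default_memory_overhead_mb(executor_memory_mb)


print(default_memory_overhead_mb(5120))  # 512
print(container_request_mb(5120))        # 5632
print(default_memory_overhead_mb(1024))  # 384 (10% would be 102, so the floor applies)
```

For small executors the 384MB floor dominates. The 10% term only takes over once executor memory exceeds 3840MB.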

 

When you get an out-of-memory message, it is not always obvious which memory setting to increase.

In the following case, increase executor-memory.

  • GC runs frequently: frequent garbage collection means on-heap memory is running short, so increase the on-heap memory.

Conversely, in the following case, increase memoryOverhead.

  • GC does not run often, but the cluster manager keeps killing the executor
    Reason: Container killed by YARN for exceeding memory limits.
    5.5 GB of 5.5 GB physical memory used.
    Consider boosting spark.yarn.executor.memoryOverhead.
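The two cases above can be turned into a rough triage helper. This is only an illustrative sketch. The function name `suggest_setting`, the `gc_time_ratio` input, and the 10% GC threshold are all assumptions made for the example, not Spark settings. Only the YARN message text is taken from the log above.

```python
import re

# Matches the YARN kill message quoted above, even when it wraps across lines.
YARN_KILL = re.compile(
    r"Container killed by YARN for exceeding memory limits\.\s*"
    r"(?P<used>[\d.]+) GB of (?P<limit>[\d.]+) GB physical memory used"
)


def suggest_setting(log_text, gc_time_ratio=0.0, gc_threshold=0.1):
    """Return the setting to raise, or None if neither symptom is present.

    gc_time_ratio is the share of task time spent in GC (hypothetical input);
    gc_threshold is an arbitrary cut-off chosen for this example.
    """
    if YARN_KILL.search(log_text):
        return "spark.executor.memoryOverhead"
    if gc_time_ratio > gc_threshold:
        return "spark.executor.memory"
    return None


sample = """Reason: Container killed by YARN for exceeding memory limits.
    5.5 GB of 5.5 GB physical memory used.
    Consider boosting spark.yarn.executor.memoryOverhead."""
print(suggest_setting(sample))
print(suggest_setting("", gc_time_ratio=0.3))
```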

Memory Managed by Spark Core

There are some additional settings related to off-heap memory:

  • spark.memory.offHeap.enabled: whether to use off-heap memory (default false)
  • spark.memory.offHeap.size: off-heap size (default 0)

They differ from spark.yarn.executor.memoryOverhead as follows:

  • spark.yarn.executor.memoryOverhead: managed by a resource manager such as YARN
  • spark.memory.offHeap.size: managed by Spark core (the Tungsten engine)

The relationship between spark.yarn.executor.memoryOverhead and spark.memory.offHeap.size is as follows:

  • spark 1.x and 2.x: total off-heap memory = spark.yarn.executor.memoryOverhead
     spark.yarn.executor.memoryOverhead includes spark.memory.offHeap.size
  • spark 3.x: total off-heap memory = spark.yarn.executor.memoryOverhead + spark.memory.offHeap.size

Reference: https://stackoverflow.com/questions/58666517/difference-between-spark-yarn-executor-memoryoverhead-and-spark-memory-offhea
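The version difference above can be made concrete with a small calculation. The sketch below follows the relationship as described in this post. The function name and the example sizes are hypothetical.

```python
def total_container_memory_mb(executor_memory_mb, memory_overhead_mb,
                              off_heap_size_mb=0, spark_major_version=3):
    """Total memory per executor container, per the relationship described above.

    Spark 3.x: the off-heap size is added on top of memoryOverhead.
    Spark 1.x/2.x: memoryOverhead is expected to already include the off-heap
    size, so it is not added again.
    """
    if spark_major_version >= 3:
        return executor_memory_mb + memory_overhead_mb + off_heap_size_mb
    return executor_memory_mb + memory_overhead_mb


# Assumed example: 5g executor memory, 512MB overhead, 1g off-heap.
print(total_container_memory_mb(5120, 512, 1024, spark_major_version=3))  # 6656
print(total_container_memory_mb(5120, 512, 1024, spark_major_version=2))  # 5632
```

In the 2.x case the 1g of off-heap memory has to fit inside the 512MB overhead, which it does not, so on those versions memoryOverhead would need to be raised by hand to cover it.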

 

Spark's Tungsten engine helps minimize garbage collection overhead. So how does it do that?

Normally, when GC runs in the JVM, the application pauses. This is known as stop-the-world. It delays application execution and lengthens the total run time.

Tungsten makes it possible to allocate memory off heap through sun.misc.Unsafe (similar to malloc in C).

Off-heap memory lives outside the JVM heap, so using memory that the GC does not manage can reduce stop-the-world pauses, which should improve performance.

The downsides are as follows.

  • Reads and writes are slower than with on-heap memory.
  • The memory has to be managed manually.

It is worth looking into when GC phases become too long during execution.
