List: All posts (59)
[parquet-tools]
parquet-tools schema myfile.parquet --> prints the file's schema
parquet-tools meta myfile.parquet --> prints the metadata
parquet-tools cat myfile.parquet --> prints the file contents
None of the applications submitted to YARN were running; all of them were stuck waiting. The ResourceManager role log showed the following error message..
Error trying to assign container token and NM token to an updated container CONTAINER_NAME
java.lang.IllegalArgumentException: java.net.UnknownHostException: HOST_NAME
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:445)
    at org.apache.hadoop.yarn.server.utils.BuilderUtils.newContainerT..
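from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

sparkConf = SparkConf().setAppName("test")
sc = SparkContext.getOrCreate(conf=sparkConf)
hc = HiveContext(sc)
# basePath tells Spark where partition discovery starts
df = hc.read.option("basePath", '/Path-to-data/')\
    .parquet('/Path-to-data/')

/Path-to-data/partition1=x/partition2=y
When the directories are laid out like this, adding the basePath option while loading the data, as in the code above, loads the partition information (partition1 and partition2 in the code above) as columns of the dataframe.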
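To illustrate why the base path matters, here is a minimal pure-Python sketch of how `key=value` directory names between a base path and a data file can be turned into columns. It is not Spark's actual implementation; the function name `partition_columns` and the file name `part-00000.parquet` are made up for this example.

```python
from pathlib import PurePosixPath


def partition_columns(file_path: str, base_path: str) -> dict:
    """Collect key=value directory segments found between base_path and the file."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(base_path))
    columns = {}
    for segment in relative.parts[:-1]:  # skip the file name itself
        key, sep, value = segment.partition("=")
        if sep:  # only key=value directories count as partitions
            columns[key] = value
    return columns


# Hypothetical data file under the directory layout described above.
data_file = "/Path-to-data/partition1=x/partition2=y/part-00000.parquet"

from_root = partition_columns(data_file, "/Path-to-data/")
from_subdir = partition_columns(data_file, "/Path-to-data/partition1=x")

print(from_root)    # {'partition1': 'x', 'partition2': 'y'}
print(from_subdir)  # {'partition2': 'y'}
```

In this sketch, starting from a deeper base path drops `partition1` from the result, which mirrors why the base path is set to the root of the partitioned directory tree.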
https://velog.io/@andrewyoon10/VSCode%EC%97%90%EC%84%9C-CC-%EC%BB%B4%ED%8C%8C%EC%9D%BC-%EB%B0%8F-%EB%94%94%EB%B2%84%EA%B9%85-%ED%99%98%EA%B2%BD-%EB%A7%8C%EB%93%A4%EA%B8%B0
Setting up a C/C++ compile and debug environment in VSCode
This time I will post about building a basic development environment for compiling and debugging C/C++ code in the Visual Studio Code editor on Windows. It proceeds step by step, and any part that does not work while following along..
velog.io
https://webnautes.tistory.com/1158
https://urakasumi.tistory.com..
Installing pip: https://quackstudy.tistory.com/13?category=801005

1. Using the impyla library
# pip install impyla
from impala.dbapi import connect

HOST = "host_ip"
PORT = 21050  # default
conn = connect(host=HOST, port=PORT)
cursor = conn.cursor()
query = "select * from default.table1 where some condition"
cursor.execute(query)
conn.close()

2. Using the pyodbc library
1) Install the cloudera odbc driver for impala
https://docs.info..
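The impyla snippet in step 1 executes the query but does not read the result. Below is a minimal sketch of fetching rows through the standard Python DB-API cursor interface, which impyla also follows. The helper name `fetch_rows` is hypothetical, and an in-memory `sqlite3` database stands in for the Impala connection so the example runs without a cluster; the table contents are made up.

```python
import sqlite3
from contextlib import closing


def fetch_rows(conn, query):
    """Run query on a DB-API connection and return all result rows as a list."""
    with closing(conn.cursor()) as cursor:
        cursor.execute(query)
        return cursor.fetchall()


# Stand-in for connect(host=HOST, port=PORT): an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer, name text)")
conn.execute("insert into table1 values (1, 'a'), (2, 'b')")

rows = fetch_rows(conn, "select * from table1 where id > 1")
conn.close()

print(rows)  # [(2, 'b')]
```

With impyla, the same helper would presumably be called with the connection returned by `connect(host=HOST, port=PORT)` before `conn.close()`.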