Other Commands
Spark Shell
- In the Spark shell, commands are entered with a leading colon (:)
scala> :help
All commands can be abbreviated, e.g., :he instead of :help.
:edit <id>|<line>         edit history
:help [command]           print this summary or command-specific help
:history [num]            show the history (optional num is commands to show)
:h? <string>              search the history
:imports [name name ...]  show import history, identifying sources of names
:implicits [-v]           show the implicits in scope
:javap <path|class>       disassemble a file or class name
:line <id>|<line>         place line(s) at the end of history
:load <path>              interpret lines in a file
:paste [-raw] [path]      enter paste mode or paste a file
:power                    enable power user mode
:quit                     exit the interpreter
:replay [options]         reset the repl and replay all previous commands
:require <path>           add a jar to the classpath
:reset [options]          reset the repl to its initial state, forgetting all session entries
:save <path>              save replayable session to a file
:sh <command line>        run a shell command (result is implicitly => List[String])
:settings <options>       update compiler options, if possible; see reset
:silent                   disable/enable automatic printing of results
:type [-v] <expr>         display the type of an expression without evaluating it
:kind [-v] <expr>         display the kind of expression's type
:warnings                 show the suppressed warnings from the most recent line which had any
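The `:sh` command above runs a shell command and makes the result available as a `List[String]`. A rough analogue of that behavior in plain Scala (usable outside the REPL as well) is sketched below using the standard `scala.sys.process` library:

```scala
import scala.sys.process._

object ShExample {
  // Run an external command and return its output as a list of lines,
  // roughly what the REPL's :sh command provides (result => List[String]).
  // Note: !! throws an exception if the command exits with a nonzero status.
  def run(cmd: Seq[String]): List[String] =
    cmd.!!.trim.split("\n").toList

  def main(args: Array[String]): Unit =
    println(run(Seq("echo", "hello")))
}
```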
Changing Configuration Values
- Use :settings to change Spark configuration values in the Spark shell.
scala> :settings spark.debug.maxToStringFields=100
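A value set this way lasts only for the current shell session. To make it persistent across sessions, the same key can be placed in conf/spark-defaults.conf (Spark's standard defaults file), which every new shell reads at startup; a minimal sketch:

```properties
# {SPARK_HOME}/conf/spark-defaults.conf
spark.debug.maxToStringFields  100
```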
Log Output
- To adjust log output in the Spark shell, change the Spark context's log level.
scala> sc
res16: org.apache.spark.SparkContext = org.apache.spark.SparkContext@1fb8b4d8

# print basic information only
scala> sc.setLogLevel("INFO")

# debug mode; data exchanged with YARN is logged continuously
scala> sc.setLogLevel("DEBUG")

# default setting
scala> sc.setLogLevel("WARN")
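The levels above form a severity order (DEBUG < INFO < WARN < ERROR): a message is emitted only if its level is at least as severe as the configured one, which is why WARN suppresses the INFO and DEBUG noise. A minimal plain-Scala sketch of that filtering rule (no Spark or log4j dependency; the `LevelFilter` object and its names are illustrative, not a real API):

```scala
// Illustrative model of log4j-style level filtering; not an actual log4j API.
object LevelFilter {
  // Severity ordering used by log4j: DEBUG < INFO < WARN < ERROR
  val severity: Map[String, Int] =
    Map("DEBUG" -> 0, "INFO" -> 1, "WARN" -> 2, "ERROR" -> 3)

  // A message passes only if it is at least as severe as the configured level.
  def enabled(configured: String, message: String): Boolean =
    severity(message) >= severity(configured)

  def main(args: Array[String]): Unit = {
    println(enabled("WARN", "INFO"))  // suppressed: INFO is below WARN
    println(enabled("WARN", "ERROR")) // emitted: ERROR is above WARN
  }
}
```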
log4j Log Output
- Use log4j to emit logs from the Spark shell or other jobs.
- Create a log4j.properties file under {SPARK_HOME}/conf/
- Example that outputs all logs at DEBUG level
- Depending on the Hadoop configuration referenced in spark-env.sh, logs may not be emitted; check those settings if output is missing.
# Set everything to be logged to the console
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Set the default spark-shell log level to WARN. When running the spark-shell, the
# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
log4j.logger.org.apache.spark.repl.Main=DEBUG

# Settings to quiet third party logs that are too verbose
log4j.logger.org.spark_project.jetty=DEBUG
log4j.logger.org.spark_project.jetty.util.component.AbstractLifeCycle=DEBUG
log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=DEBUG
log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=DEBUG
log4j.logger.org.apache.parquet=DEBUG
log4j.logger.parquet=DEBUG

# SPARK-9183: Settings to avoid annoying messages when looking up nonexistent UDFs in SparkSQL with Hive support
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=DEBUG
log4j.logger.org.apache.hadoop.hive.ql.exec.FunctionRegistry=DEBUG
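log4j 1.x reads log4j.properties as an ordinary Java properties file, so the keys can be inspected with `java.util.Properties` from the standard library. The sketch below (plain Scala, no Spark or log4j dependency; `Log4jPropsCheck` and `rootLevel` are illustrative names) shows how the root level is encoded in the `log4j.rootCategory` key: the level is the first comma-separated token, and the rest are appender names.

```scala
import java.io.StringReader
import java.util.Properties

object Log4jPropsCheck {
  // A minimal excerpt of the config file above, for illustration.
  val conf: String =
    """log4j.rootCategory=DEBUG, console
      |log4j.logger.org.apache.spark.repl.Main=DEBUG
      |""".stripMargin

  // Extract the root log level: "DEBUG, console" -> "DEBUG".
  def rootLevel(text: String): String = {
    val props = new Properties()
    props.load(new StringReader(text))
    props.getProperty("log4j.rootCategory").split(",")(0).trim
  }

  def main(args: Array[String]): Unit =
    println(rootLevel(conf))
}
```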