Background of the Spark History Server
Take standalone mode as an example: while a Spark Application is running, Spark provides a WEBUI that lists the application's runtime information. That WEBUI, however, shuts down as soon as the Application finishes (successfully or not); in other words, once a Spark Application has completed (succeeded/failed), its history can no longer be viewed.
The Spark History Server was created precisely for this situation: with the proper configuration, event log information is recorded while the Application executes, so that after the Application finishes, the WEBUI can re-render the UI and present the runtime information from that Application's execution.
Even when Spark runs on yarn or mesos, the spark history server can still reconstruct the runtime information of a completed Application (provided that the Application's event log has been recorded).
Configuring & Using the Spark History Server
Start the spark history server with the default configuration:
cd $SPARK_HOME/sbin
start-history-server.sh
Error:
starting org.apache.spark.deploy.history.HistoryServer, logging to /home/spark/software/source/compile/deploy_spark/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-hadoop000.out
failed to launch org.apache.spark.deploy.history.HistoryServer:
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:44)
        ... 6 more
You need to specify the directory when starting it:
start-history-server.sh hdfs://hadoop000:8020/directory
hdfs://hadoop000:8020/directory can also be set in the configuration file, in which case it does not have to be specified when starting the history-server; how to configure this is described below.
Note: this directory must be created on HDFS in advance, otherwise the history-server fails to start.
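For example, a minimal sequence might look like this (a sketch, reusing the hadoop000:8020 namenode and the /directory path from this article):
# Create the event log directory on HDFS first; the history server will not create it
hadoop fs -mkdir -p hdfs://hadoop000:8020/directory
# Then start the history server pointing at it
start-history-server.sh hdfs://hadoop000:8020/directory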
Once started, it can be accessed through the WEBUI; the default port is 18080: http://hadoop000:18080
The list on the page is empty by default; the screenshot below appeared after I ran a few spark-sql tests.
Description of the history-server-related configuration parameters
1) spark.history.updateInterval
    Default: 10
    The interval, in seconds, at which log information is refreshed
2) spark.history.retainedApplications
    Default: 50
    The number of Application history records to keep in memory; beyond this value, older application entries are dropped, and the page has to be rebuilt the next time a dropped application is accessed.
3) spark.history.ui.port
    Default: 18080
    The web port of the HistoryServer
4) spark.history.kerberos.enabled
    Default: false
    Whether to log in to the HistoryServer via kerberos; useful when the persistence layer lives on HDFS in a secured cluster. If set to true, the two properties below must also be configured
5) spark.history.kerberos.principal
    The kerberos principal name used for the HistoryServer
6) spark.history.kerberos.keytab
    The location of the kerberos keytab file used for the HistoryServer
7) spark.history.ui.acls.enable
    Default: false
    Whether to check ACLs when authorizing users to view application information. If enabled, only the application's owner and the users named in spark.ui.view.acls may view the application's information; otherwise, no check is performed
8) spark.eventLog.enabled
    Default: false
    Whether to log Spark events, which is what allows the webUI to be reconstructed after the application finishes
9) spark.eventLog.dir
    Default: file:///tmp/spark-events
    The path under which the log information is saved; it may be an HDFS path starting with hdfs:// or a local path starting with file://, and in either case it must be created in advance
10) spark.eventLog.compress
    Default: false
    Whether to compress the logged Spark events; requires spark.eventLog.enabled to be true, and snappy is used by default
Properties that start with spark.history are configured via SPARK_HISTORY_OPTS in spark-env.sh; properties that start with spark.eventLog go into spark-defaults.conf.
The configuration I used during testing is as follows:
spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop000:8020/directory
spark.eventLog.compress true
spark-env.sh
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=7777 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://hadoop000:8020/directory"
Parameter descriptions:
spark.history.ui.port=7777 changes the port for WEBUI access to 7777
spark.history.fs.logDirectory=hdfs://hadoop000:8020/directory with this property configured, the path no longer has to be specified explicitly when running start-history-server.sh
spark.history.retainedApplications=3 sets the number of Application history records to keep; beyond this value, older application entries are dropped
After adjusting the parameters, start start-history-server.sh:
start-history-server.sh
Access the WEBUI: http://hadoop000:7777
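To quickly confirm the port change took effect, the HTTP status code can be checked from the shell (a sketch, assuming the hadoop000 host used throughout this article):
# Expect 200 once the history server is serving on the new port
curl -s -o /dev/null -w "%{http_code}\n" http://hadoop000:7777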
A few questions that came up while using the spark history server:
Question 1: what is the difference between the directories specified by spark.history.fs.logDirectory and spark.eventLog.dir?
After testing I found:
spark.eventLog.dir: all the information produced while an Application runs is recorded under the path this property specifies;
spark.history.fs.logDirectory: the Spark History Server page only displays the information found under the path this property specifies;
For example: suppose spark.eventLog.dir initially pointed to hdfs://hadoop000:8020/directory and was later changed to hdfs://hadoop000:8020/directory2.
Then if spark.history.fs.logDirectory points to hdfs://hadoop000:8020/directory, it can only display the log information of the Applications that ran under that directory; and vice versa.
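To see which application logs ended up in which directory (a sketch, reusing the two paths above), they can be listed directly on HDFS:
# Event logs written while spark.eventLog.dir pointed at /directory
hadoop fs -ls hdfs://hadoop000:8020/directory
# Event logs written after it was switched to /directory2
hadoop fs -ls hdfs://hadoop000:8020/directory2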
Question 2: spark.history.retainedApplications=3 seemingly has no effect???
The History Server will list all applications. It will just retain a max number of them in memory. That option does not control how many applications are shown, it controls how much memory the HS will need.
Note: this parameter is not the number of application records displayed on the page, but the number kept in memory; records already in memory are simply read and rendered when their pages are accessed.
For example, if the parameter is set to 10, then at most 10 applications' log information is held in memory; when the 11th is added, the first one is evicted, and the next time the first application's page is accessed, its log information has to be re-read from the specified path to render the page.
See the official documentation for details: http://spark.apache.org/docs/latest/monitoring.html
Today, while testing spark-sql running on yarn, I happened to notice a problem in the logs:
spark-sql --master yarn
14/12/29 15:23:17 INFO Client: Requesting a new application from cluster with 1 NodeManagers
14/12/29 15:23:17 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
14/12/29 15:23:17 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
14/12/29 15:23:17 INFO Client: Setting up container launch context for our AM
14/12/29 15:23:17 INFO Client: Preparing resources for our AM container
14/12/29 15:23:17 INFO Client: Uploading resource file:/home/spark/software/source/compile/deploy_spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar -> hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0093/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
14/12/29 15:23:18 INFO Client: Setting up the launch environment for our AM container
Then I opened another spark-sql command line, and saw in its logs:
14/12/29 15:24:03 INFO Client: Requesting a new application from cluster with 1 NodeManagers
14/12/29 15:24:03 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
14/12/29 15:24:03 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
14/12/29 15:24:03 INFO Client: Setting up container launch context for our AM
14/12/29 15:24:03 INFO Client: Preparing resources for our AM container
14/12/29 15:24:03 INFO Client: Uploading resource file:/home/spark/software/source/compile/deploy_spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar -> hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0094/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
14/12/29 15:24:05 INFO Client: Setting up the launch environment for our AM container
Then check the files on HDFS:
hadoop fs -ls hdfs://hadoop000:8020/user/spark/.sparkStaging/
drwx------ - spark supergroup 0 2014-12-29 15:23 hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0093
drwx------ - spark supergroup 0 2014-12-29 15:24 hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0094
Every Application uploads its own copy of the spark-assembly-x.x.x-SNAPSHOT-hadoopx.x.x-cdhx.x.x.jar jar, which hurts HDFS performance and takes up HDFS space.
In the Spark documentation (http://spark.apache.org/docs/latest/running-on-yarn.html) I found the spark.yarn.jar property: store spark-assembly-xxxxx.jar under hdfs://hadoop000:8020/spark_lib/, as sketched below.
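A minimal upload sequence might look like this (a sketch; /spark_lib is just the directory name chosen in this article, and the local jar path is the one seen in the logs above):
# Create the shared jar directory on HDFS
hadoop fs -mkdir -p hdfs://hadoop000:8020/spark_lib
# Upload the assembly jar once, so every application can reference it
hadoop fs -put /home/spark/software/source/compile/deploy_spark/assembly/target/scala-2.10/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar hdfs://hadoop000:8020/spark_lib/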
Then add the property to spark-defaults.conf:
spark.yarn.jar hdfs://hadoop000:8020/spark_lib/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
Start spark-sql --master yarn again and observe the logs:
14/12/29 15:39:02 INFO Client: Requesting a new application from cluster with 1 NodeManagers
14/12/29 15:39:02 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
14/12/29 15:39:02 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
14/12/29 15:39:02 INFO Client: Setting up container launch context for our AM
14/12/29 15:39:02 INFO Client: Preparing resources for our AM container
14/12/29 15:39:02 INFO Client: Source and destination file systems are the same. Not copying hdfs://hadoop000:8020/spark_lib/spark-assembly-1.3.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar
14/12/29 15:39:02 INFO Client: Setting up the launch environment for our AM container
Check the files on HDFS:
hadoop fs -ls hdfs://hadoop000:8020/user/spark/.sparkStaging/application_1416381870014_0097
There is no spark-assembly-xxxxx.jar under this Application's directory anymore, which saves both the assembly-jar upload on every run and the HDFS space it would occupy.
During testing I also ran into an error like the following:
Application application_xxxxxxxxx_yyyy failed 2 times due to AM Container for application_xxxxxxxxx_yyyy
exited with exitCode: -1000 due to: java.io.FileNotFoundException: File /tmp/hadoop-spark/nm-local-dir/filecache does not exist
Creating a filecache folder under /tmp/hadoop-spark/nm-local-dir resolves the error.
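For example (run on the NodeManager host, using the local dir from the error message above):
# Create the missing filecache directory on the NodeManager's local filesystem
mkdir -p /tmp/hadoop-spark/nm-local-dir/filecache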