This article describes how to set up a Hadoop cluster on Linux (Red Hat 9). The cluster consists of three machines, with the following hostnames and addresses:
linux      192.168.35.101
linux02    192.168.35.102
linux03    192.168.35.103
From the MapReduce perspective, linux is the master node, while linux02 and linux03 are the slave nodes.
From the HDFS storage perspective, linux is the namenode, while linux02 and linux03 are the datanodes.
One namenode machine, hostname linux, with the following hosts file:
127.0.0.1         linux      localhost.localdomain    localhost
192.168.35.101    linux      linux.localdomain        linux
192.168.35.102    linux02
192.168.35.103    linux03
(Note: the hostname linux also appears on the 127.0.0.1 line, which is why the startup log later reports host = linux/127.0.0.1. If the daemons end up bound to loopback and the datanodes cannot reach the namenode, remove linux from the 127.0.0.1 entry.)
Two datanode machines, hostnames linux02 and linux03.
> linux02's hosts file
127.0.0.1         linux02    localhost.localdomain    localhost
192.168.35.102    linux02    linux02.localdomain      linux02
192.168.35.101    linux
192.168.35.103    linux03
> linux03's hosts file
127.0.0.1         linux03    localhost.localdomain    localhost
192.168.35.103    linux03    linux03.localdomain      linux03
192.168.35.101    linux
192.168.35.102    linux02
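The mappings above can be sanity-checked before going further. A small self-contained sketch, run against a local copy of the table (on the real machines you would read /etc/hosts and confirm with ping):

```shell
# Local copy of the host table above; a real check would read /etc/hosts.
hosts=$(cat <<'EOF'
192.168.35.101 linux
192.168.35.102 linux02
192.168.35.103 linux03
EOF
)
# Look up the IP for a hostname in the table.
lookup() {
    printf '%s\n' "$hosts" | awk -v h="$1" '$2 == h { print $1 }'
}
echo "linux02 -> $(lookup linux02)"   # prints: linux02 -> 192.168.35.102
```

Every machine must resolve every other machine's hostname, or the Hadoop start scripts in step 3 will fail to reach the slaves.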
1. Install the JDK
> Download jdk-6u18-linux-i586-rpm.bin from java.sun.com
> Upload the JDK to the /root directory on linux via FTP
> Enter the /root directory and run the following commands in order:
chmod 755 jdk-6u18-linux-i586-rpm.bin
./jdk-6u18-linux-i586-rpm.bin
Follow the prompts and the installation will complete.
> Configure the environment variables
cd into /etc, open the profile file with vi, and append the following to the end of the file:
export JAVA_HOME=/usr/java/jdk1.6.0_18
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Note: the JDK must be installed on all three machines.
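On each machine you would then run `source /etc/profile` and `java -version` to confirm. The append-and-source step itself can be sketched self-contained, using a temporary file in place of /etc/profile (the path /tmp/profile.demo is only for illustration):

```shell
# Stand-in for /etc/profile; on a real node you would append to /etc/profile itself.
demo=/tmp/profile.demo
cat > "$demo" <<'EOF'
export JAVA_HOME=/usr/java/jdk1.6.0_18
export PATH=$JAVA_HOME/bin:$PATH
EOF
# Source the file and confirm the variables took effect,
# as `source /etc/profile` would on a real node.
. "$demo"
echo "JAVA_HOME=$JAVA_HOME"   # prints: JAVA_HOME=/usr/java/jdk1.6.0_18
```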
2. Set up passwordless SSH access between the master and slave machines
It is best to use the same account name on all three machines; here the root account is used throughout.
On the namenode machine linux:
Log in to linux as root and run the following command in the /root directory:
ssh-keygen -t rsa
Press Enter at every prompt; this creates two files, id_rsa.pub and id_rsa, in /root/.ssh/.
Next, enter the /root/.ssh directory:
cd .ssh
Then copy the id_rsa.pub file to linux02 and linux03:
scp -r id_rsa.pub root@192.168.35.102:/root/.ssh/authorized_keys_01
scp -r id_rsa.pub root@192.168.35.103:/root/.ssh/authorized_keys_01
On the datanode machine linux02:
Log in to linux02 as root and run the following command in the /root directory:
ssh-keygen -t rsa
Press Enter at every prompt; this creates two files, id_rsa.pub and id_rsa, in /root/.ssh/.
Next, enter the /root/.ssh directory:
cd .ssh
Then copy the id_rsa.pub file to the namenode machine linux:
scp -r id_rsa.pub root@192.168.35.101:/root/.ssh/authorized_keys_02
On the datanode machine linux03:
Log in to linux03 as root and run the following command in the /root directory:
ssh-keygen -t rsa
Press Enter at every prompt; this creates two files, id_rsa.pub and id_rsa, in /root/.ssh/.
Next, enter the /root/.ssh directory:
cd .ssh
Then copy the id_rsa.pub file to the namenode machine linux:
scp -r id_rsa.pub root@192.168.35.101:/root/.ssh/authorized_keys_03
*******************************************************************************
The steps above generated an RSA key pair on each of linux, linux02, and linux03, copied linux's id_rsa.pub to linux02 and linux03, and copied linux02's and linux03's id_rsa.pub to linux.
The following steps remain:
On linux:
Log in to linux as root, enter the /root/.ssh directory, and run:
cat id_rsa.pub >> authorized_keys
cat authorized_keys_02 >> authorized_keys
cat authorized_keys_03 >> authorized_keys
chmod 644 authorized_keys
On linux02:
Log in to linux02 as root, enter the /root/.ssh directory, and run:
cat id_rsa.pub >> authorized_keys
cat authorized_keys_01 >> authorized_keys
chmod 644 authorized_keys
On linux03:
Log in to linux03 as root, enter the /root/.ssh directory, and run:
cat id_rsa.pub >> authorized_keys
cat authorized_keys_01 >> authorized_keys
chmod 644 authorized_keys
With this configuration in place, root on linux can reach linux02 and linux03 without a password, and linux02 and linux03 can likewise connect back with ssh linux.
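On the real machines, `ssh linux02 hostname` from linux should now return without prompting for a password. The authorized_keys assembly itself can be sketched locally with dummy key material (the temp directory and key strings below are placeholders, not real keys):

```shell
# Dummy key material in a throwaway directory; real keys come from
# ssh-keygen as described above.
dir=$(mktemp -d)
echo "ssh-rsa AAAAdummy1 root@linux"   > "$dir/id_rsa.pub"
echo "ssh-rsa AAAAdummy2 root@linux02" > "$dir/authorized_keys_02"
echo "ssh-rsa AAAAdummy3 root@linux03" > "$dir/authorized_keys_03"
# Same concatenation order as on the namenode machine linux:
cat "$dir/id_rsa.pub"         >> "$dir/authorized_keys"
cat "$dir/authorized_keys_02" >> "$dir/authorized_keys"
cat "$dir/authorized_keys_03" >> "$dir/authorized_keys"
chmod 644 "$dir/authorized_keys"
wc -l < "$dir/authorized_keys"   # prints: 3
```

The chmod 644 matters: sshd refuses keys in an authorized_keys file that is writable by other users.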
3. Install and configure Hadoop
> Install Hadoop on the namenode machine linux
I downloaded hadoop-0.20.2.tar.gz, uploaded it via FTP to /root on the linux machine, and extracted it into the installation directory /usr/hadoop, so the Hadoop root directory is /usr/hadoop/hadoop-0.20.2/.
Edit the /etc/profile file and append the following to the end:
export HADOOP_HOME=/usr/hadoop/hadoop-0.20.2
export PATH=$HADOOP_HOME/bin:$PATH
> Configure Hadoop
core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://192.168.35.101:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hadoop/hadoop-${user.name}</value>
    </property>
</configuration>
hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>/home/hadoop/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/home/hadoop/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
</configuration>
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>192.168.35.101:9001</value>
    </property>
</configuration>
masters:
192.168.35.101
slaves:
192.168.35.102
192.168.35.103
This completes the basic Hadoop configuration.
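One step the walkthrough does not show: the Hadoop start scripts read JAVA_HOME from conf/hadoop-env.sh rather than from /etc/profile, so daemons launched over SSH on the slaves may fail with a "JAVA_HOME is not set" error unless it is set there as well. A minimal addition, with the path matching the JDK installed in step 1:

```shell
# Addition to conf/hadoop-env.sh (on all three machines): point the
# Hadoop scripts at the JDK installed in step 1.
export JAVA_HOME=/usr/java/jdk1.6.0_18
```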
> Deploy the configured Hadoop from the namenode machine to the datanode machines
Use scp for the remote copy, running the following commands in order:
scp -r /usr/hadoop/hadoop-0.20.2 root@192.168.35.102:/usr/hadoop/
scp -r /usr/hadoop/hadoop-0.20.2 root@192.168.35.103:/usr/hadoop/
4. Test
Log in to the namenode machine linux as root and enter the directory /usr/hadoop/hadoop-0.20.2/:
cd /usr/hadoop/hadoop-0.20.2
> Format the HDFS namenode
[root@linux hadoop-0.20.2]# bin/hadoop namenode -format
11/07/26 21:16:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = linux/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:?? build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
Re-format filesystem in /home/hadoop/name ? (Y or N) Y
11/07/26 21:16:07 INFO namenode.FSNamesystem: fsOwner=root,root,bin,daemon,sys,adm,disk,wheel
11/07/26 21:16:07 INFO namenode.FSNamesystem: supergroup=supergroup
11/07/26 21:16:07 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/07/26 21:16:07 INFO common.Storage: Image file of size 94 saved in 0 seconds.
11/07/26 21:16:07 INFO common.Storage: Storage directory /home/hadoop/name has been successfully formatted.
11/07/26 21:16:07 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at linux/127.0.0.1
************************************************************/
> Start Hadoop
[root@linux hadoop-0.20.2]# bin/start-all.sh
starting namenode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-namenode-linux.out
192.168.35.102: starting datanode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-linux02.out
192.168.35.103: starting datanode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-linux03.out
192.168.35.101: starting secondarynamenode, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-secondarynamenode-linux.out
starting jobtracker, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-jobtracker-linux.out
192.168.35.103: starting tasktracker, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-linux03.out
192.168.35.102: starting tasktracker, logging to /usr/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-linux02.out
[root@linux hadoop-0.20.2]#
> Check the running processes with jps
[root@linux hadoop-0.20.2]# jps
7118 SecondaryNameNode
7343 Jps
6955 NameNode
7204 JobTracker
[root@linux hadoop-0.20.2]#
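jps on linux shows only the master-side daemons; running jps on linux02 and linux03 should additionally show DataNode and TaskTracker. From the namenode, bin/hadoop dfsadmin -report summarizes HDFS, and its "Datanodes available" line confirms that both datanodes joined. A sketch of pulling that count out of report-style output (the sample text in the here-doc is illustrative, not captured from this cluster):

```shell
# Stand-in for the output of `bin/hadoop dfsadmin -report`.
report=$(cat <<'EOF'
Configured Capacity: 16843504640 (15.69 GB)
Datanodes available: 2 (2 total, 0 dead)
EOF
)
# Extract the live-datanode count from the "Datanodes available" line.
live=$(printf '%s\n' "$report" | sed -n 's/^Datanodes available: \([0-9]*\).*/\1/p')
echo "live datanodes: $live"   # prints: live datanodes: 2
```

If the count is 0, the usual suspects are the hosts files from the beginning of this article and firewall rules blocking port 9000.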
posted on 2011-08-25 16:01 by jadmin