System: Ubuntu 12.04
Software: Hadoop 0.23.4
Machine: ThinkPad T420
Preparing the environment before installation:
First, make sure Maven is installed on your machine, because versions of Hadoop after 0.23.0 are built and managed with Maven. To install it, you can simply run sudo apt-get install maven2 from the command line; if that version is too old, download a newer release from the official site and install and configure it yourself. I won't go into the details here.
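Before building, it is worth a quick sanity check that both Maven and a JDK are visible on the PATH (each command should print version information):

$ mvn -version
$ java -version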
Now let's start building the source:
(1) Unpack the source archive:
tar -zxf hadoop-0.23.4-src.tar.gz
(2) Enter the source tree:
cd hadoop-0.23.4-src
(3) Build the source:
mvn package -Pdist -DskipTests -Dtar
When the build finishes, hadoop-0.23.4-src/hadoop-dist/target/hadoop-0.23.4.tar.gz is generated. This is the binary distribution we built ourselves; if we haven't touched the source, it is no different from the binary release downloadable from the official site. But if we want to modify Hadoop's source code, we have to build it ourselves.
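You can confirm the build succeeded by listing the target directory; the tarball should appear among the build artifacts:

$ ls hadoop-dist/target/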
(4) Installing and configuring hadoop-0.23.4
The first few steps are the same as for earlier versions, so I have copied them straight from my earlier article:
Step 1: prepare a machine (this step may sound a bit redundant); the earlier article used Ubuntu 11.10.
Step 2: get a Hadoop tarball. The earlier article used hadoop-0.21.0; here, substitute the hadoop-0.23.4.tar.gz we just built in the steps below.
Step 3: install the Java runtime:
sudo apt-get install openjdk-6-jdk
Step 4: create a dedicated hadoop group and user:
groupadd hadoop
useradd -g hadoop hadoop
passwd hadoop
Step 5: set up passwordless SSH login for the hadoop user:
# su - hadoop
$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
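It's worth verifying that key-based login to localhost works before continuing (if you are still prompted for a password, ~/.ssh/authorized_keys may need mode 600):

$ ssh localhost
$ exit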
Step 6: unpack Hadoop under /opt and give the hadoop user ownership:
# cd /opt
# tar xzf hadoop-0.23.4.tar.gz
# ln -s hadoop-0.23.4 hadoop
# chown -R hadoop:hadoop hadoop-0.23.4
su - hadoop
vi .bashrc
Add the following at the top of the file:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64   # adjust to wherever openjdk-6 actually landed on your machine
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
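Reload the shell configuration and confirm the hadoop command resolves (a quick check):

$ source ~/.bashrc
$ hadoop version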
Next, configure the YARN environment:
1. cd /opt/hadoop/etc/hadoop
2. vi yarn-env.sh
Add the following to the file:
export HADOOP_PREFIX=/opt/hadoop
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
Then edit core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:54310/</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/hadoop-root</value>
  </property>
  <property>
    <name>fs.arionfs.impl</name>
    <value>org.apache.hadoop.fs.pvfs2.Pvfs2FileSystem</value>
    <description>The FileSystem for arionfs.</description>
  </property>
</configuration>
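The directory named by hadoop.tmp.dir should be writable by the hadoop user; Hadoop can usually create it on demand, but creating it up front avoids permission surprises:

$ mkdir -p /opt/hadoop/hadoop-root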
Then edit hdfs-site.xml (these properties go inside its <configuration> element):
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/opt/hadoop/workspace/hadoop_space/dfs/name</value>
  <final>true</final>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/opt/hadoop/workspace/hadoop_space/dfs/data</value>
  <final>true</final>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
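Hadoop can normally create the name and data directories itself, but pre-creating them as the hadoop user avoids ownership problems (paths taken from hdfs-site.xml above):

$ mkdir -p /opt/hadoop/workspace/hadoop_space/dfs/name
$ mkdir -p /opt/hadoop/workspace/hadoop_space/dfs/data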
Then edit mapred-site.xml (again inside <configuration>):
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.job.tracker</name>
  <value>hdfs://localhost:9001</value>
  <final>true</final>
</property>
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>1536</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx1024M</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>3072</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx2560M</value>
</property>
<property>
  <name>mapreduce.task.io.sort.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.task.io.sort.factor</name>
  <value>100</value>
</property>
<property>
  <name>mapreduce.reduce.shuffle.parallelcopies</name>
  <value>50</value>
</property>
<property>
  <name>mapreduce.system.dir</name>
  <value>file:/opt/hadoop/workspace/hadoop_space/mapred/system</value>
</property>
<property>
  <name>mapreduce.local.dir</name>
  <value>file:/opt/hadoop/workspace/hadoop_space/mapred/local</value>
  <final>true</final>
</property>
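As with the HDFS directories, the mapred system and local directories can be created up front (paths taken from mapred-site.xml above):

$ mkdir -p /opt/hadoop/workspace/hadoop_space/mapred/system
$ mkdir -p /opt/hadoop/workspace/hadoop_space/mapred/local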
Finally, edit yarn-site.xml (inside <configuration>):
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce.shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>user.name</name>
  <value>hadoop</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>localhost:54311</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>localhost:54312</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>localhost:54313</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>localhost:54314</value>
</property>
<property>
  <name>yarn.web-proxy.address</name>
  <value>localhost:54315</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost</value>
</property>
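Before the very first start, format the HDFS namenode as the hadoop user (in 0.23 the hdfs command lives under bin/):

$ cd /opt/hadoop
$ bin/hdfs namenode -format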
Then start HDFS and YARN from the sbin directory:
$ cd /opt/hadoop/sbin
$ ./start-dfs.sh
$ ./start-yarn.sh
and that's it!
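As a final check, jps should list the daemons for this single-node setup (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager), and the ResourceManager web UI should answer at http://localhost:54313, per yarn.resourcemanager.webapp.address above:

$ jps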