
Installing Spark 1.6 on Ubuntu 15

Date: 2017/3/3 12:50:48   Editor: Linux Tech
Installing Spark on Ubuntu
1. Install Ubuntu.
2. Set the root password: sudo passwd root
[sudo] password for you: type your password (nothing is echoed)
3. Install VMware Tools: copy the archive to the desktop, extract it, then as root (su) run ./vm...install...
4. System Settings > Language Support > check for and apply updates.
5. Reboot.
Check whether the ssh service is installed on Ubuntu:
ps -e | grep ssh   (if the service is running you should see both "ssh-agent" and "sshd"; otherwise it is not installed, or not started at boot)
Install the ssh server: sudo apt-get install openssh-server
Start the service: # /etc/init.d/ssh start
sudo apt-get install rpm
Install the JDK (first remove any pre-installed OpenJDK):
rpm -qa | grep java
rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.50.1.11.5.el6_3.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.9-2.3.4.1.el6_3.x86_64
rpm -e --nodeps tzdata-java-2012j-1.el6.noarch
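Note that the package names above (with the el6 suffix) come from a RHEL/CentOS system; on a stock Ubuntu install a pre-installed OpenJDK, if any, is managed by dpkg/apt instead. A hedged sketch of the Ubuntu-side check (the openjdk-7-jdk package name is an example, not taken from this tutorial):

```shell
# Look for Debian-packaged OpenJDK; dpkg is Ubuntu's package database tool.
if command -v dpkg >/dev/null 2>&1; then
  found=$(dpkg -l 2>/dev/null | grep -i openjdk || true)
else
  found=""
fi
if [ -n "$found" ]; then
  echo "OpenJDK packages present:"
  echo "$found"
  # sudo apt-get remove --purge openjdk-7-jdk   # example removal command
else
  echo "no packaged OpenJDK found"
fi
```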
mkdir ~/modules
mkdir ~/tools
mkdir ~/software
Unpack the JDK tarball:
tar -zxvf jdk-7u79-linux-x64.tar.gz
mv jdk1.7.0_79 ~/modules/jdk1.7
Configure the environment variables (edit with gedit or vi; no sudo is needed for your own ~/.bashrc):
gedit ~/.bashrc
vi ~/.bashrc
##JAVA
export JAVA_HOME=/home/spark/modules/jdk1.7
export PATH=$PATH:$JAVA_HOME/bin
source ~/.bashrc
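The two export lines above can be sanity-checked from the same shell. This sketch uses the tutorial's install path, which may differ on your machine, and confirms the JDK bin directory actually landed on PATH:

```shell
# Assumed install location from this tutorial; adjust if yours differs.
export JAVA_HOME=/home/spark/modules/jdk1.7
export PATH="$PATH:$JAVA_HOME/bin"

# Colon-delimited containment check; avoids false matches on substrings.
case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) java_on_path=yes ;;
  *)                    java_on_path=no  ;;
esac
echo "JDK bin on PATH: $java_on_path"
```

If java -version still reports an old JDK afterwards, re-run source ~/.bashrc in the current shell.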
Set the hostname (as root, su):
vi /etc/hosts   (format: IP FQDN alias)
192.168.192.138 jarvan.dragon.org jarvan
Permanent:
vi /etc/hostname
Jarvan
(On Ubuntu, /etc/hostname contains only the bare hostname; the HOSTNAME=... syntax belongs to /etc/sysconfig/network on RHEL/CentOS.)
Temporary (until reboot):
hostname Jarvan
Install Hadoop 2.6
Set up passwordless SSH to the local machine:
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
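sshd silently ignores an authorized_keys file with loose permissions, which is a common reason passwordless login "doesn't work". A sketch that tightens the standard paths (run as the login user):

```shell
# Ensure ~/.ssh and authorized_keys have the modes sshd requires.
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
mode=$(stat -c %a ~/.ssh/authorized_keys)
echo "authorized_keys mode: $mode"
```

Afterwards ssh localhost should log in without a password prompt (it may still ask once to confirm the host key).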
tar -zxvf hadoop-2.6.0-x64.tar.gz -C ~/modules/
vi ~/.bashrc
(vi tip: after deleting with Del, press a or i to return to insert mode)
##HADOOP
export HADOOP_HOME=/home/spark/modules/hadoop-2.6.0
export PATH=$PATH:/home/spark/modules/hadoop-2.6.0/sbin:/home/spark/modules/hadoop-2.6.0/bin
source ~/.bashrc
cd ~/modules/hadoop-2.6.0/etc/hadoop
slaves
jarvan.dragon.org
hadoop-env.sh
export JAVA_HOME=/home/spark/modules/jdk1.7/
yarn-env.sh
export JAVA_HOME=/home/spark/modules/jdk1.7/
mapred-env.sh
export JAVA_HOME=/home/spark/modules/jdk1.7/
core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/home/spark/tools/hadoopdata</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://jarvan.dragon.org:9000</value>
</property>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>jarvan.dragon.org</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
cp mapred-site.xml.template mapred-site.xml
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Format the NameNode:
bin/hdfs namenode -format
Start the daemons: start-dfs.sh, then start-yarn.sh
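Whether the Hadoop daemons actually came up can be checked with jps, which ships with the JDK; pgrep is a fallback when jps is not on PATH. On this single-node setup you would expect NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager:

```shell
# List running Java processes; expect the HDFS and YARN daemons after start-up.
if command -v jps >/dev/null 2>&1; then
  procs=$(jps 2>/dev/null || true)
else
  procs=$(pgrep -fl java 2>/dev/null || true)
fi
echo "${procs:-no java processes found}"
```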
Install Scala, then Spark
tar -zxvf scala-2.10.5.tgz -C ~/modules/
tar -zxvf spark-1.6.0-bin-hadoop2.6.tgz -C ~/modules/
vi ~/.bashrc
export SCALA_HOME=/home/spark/modules/scala-2.10.5
export PATH=$PATH:$SCALA_HOME/bin
export SPARK_HOME=/home/spark/modules/spark-1.6.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
source ~/.bashrc
Spark Standalone mode (cluster manager)
cp spark-env.sh.template spark-env.sh
conf/spark-env.sh (see the comments in the template file for details):
JAVA_HOME=/home/spark/modules/jdk1.7
SCALA_HOME=/home/spark/modules/scala-2.10.5
HADOOP_CONF_DIR=/home/spark/modules/hadoop-2.6.0/etc/hadoop
SPARK_MASTER_IP=jarvan.dragon.org
SPARK_MASTER_PORT=7077 # default 7077
SPARK_MASTER_WEBUI_PORT=8080 # default 8080
SPARK_WORKER_CORES=1
SPARK_WORKER_MEMORY=1000m
SPARK_WORKER_PORT=7078
SPARK_WORKER_WEBUI_PORT=8081
SPARK_WORKER_INSTANCES=1
cp slaves.template slaves
conf/slaves (worker node list):
jarvan.dragon.org
cp spark-defaults.conf.template spark-defaults.conf
conf/spark-defaults.conf (default master URL):
spark.master spark://jarvan.dragon.org:7077
Start Standalone mode:
sbin/start-master.sh
sbin/start-slaves.sh
Check the master web UI:
http://192.168.192.138:8080/
Open a spark-shell against the Standalone master:
spark-shell --master spark://jarvan.dragon.org:7077 --executor-memory 300m
Check the application web UI:
http://192.168.192.138:4040/
Test (in the spark-shell):
val num = sc.parallelize(1 to 10)
val rdd = num.map(x => (x, 1))
rdd.collect // returns Array((1,1), (2,1), ..., (10,1))
rdd.saveAsTextFile("hdfs://192.168.192.138:9000/data/output1")
Local mode:
spark-shell --master local[1]