Install Spark, ref:
http://devopspy.com/python/apache-spark-pyspark-centos-rhel/
cd /opt
wget http://www-eu.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
tar -xzf spark-2.2.1-bin-hadoop2.7.tgz
ln -s spark-2.2.1-bin-hadoop2.7 spark
check that /etc/hosts maps the machine's hostname to its IP
How to set path?
export SPARK_HOME=/opt/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$SPARK_HOME/python/lib/pyspark.zip:$PYTHONPATH
export PATH=$SPARK_HOME/python:$PATH
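Note that shell assignments must not have spaces around `=` (`export SPARK_HOME = /opt/spark` fails with "not a valid identifier"). A quick sanity check after setting the variables:

```shell
# Sanity check: verify the exports took effect (no spaces around '=').
export SPARK_HOME=/opt/spark
export PYTHONPATH=$SPARK_HOME/python/lib/pyspark.zip:$PYTHONPATH
echo "SPARK_HOME=$SPARK_HOME"
echo "PYTHONPATH=$PYTHONPATH"
```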
How to start master?
./sbin/start-master.sh
1) If you get an error like below when running start-master.sh:
"hostname: Unknown host"
set the hostname properly
hostname test.com
hostname -f #should give you some output
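A minimal sketch to confirm the hostname resolves before starting the master (uses `getent`, available on most Linux systems):

```shell
# Resolve the machine's hostname the same way start-master.sh will.
HN=$(hostname -f 2>/dev/null || hostname)
if getent hosts "$HN" >/dev/null 2>&1; then
  echo "OK: $HN resolves"
else
  echo "FIX: add $HN to /etc/hosts"
fi
```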
2) If you get an error like below:
"Unsupported major.minor version 52.0" exception while using the Spark web application framework
check which Java the jar files (/opt/spark/jars) were compiled for against your installed Java; class-file version 52.0 corresponds to Java 8, so the installed JDK must be Java 8 or newer
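The class-file major version maps to the Java release as (major - 44), so 52.0 means the jars were built for Java 8; a quick sketch:

```shell
# A .class file's major version N corresponds to Java release (N - 44):
# 50 -> Java 6, 51 -> Java 7, 52 -> Java 8.
MAJOR=52
echo "major.minor $MAJOR.0 => needs Java $((MAJOR - 44)) or newer"
```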
How to start spark master?
cd /opt/spark
./sbin/start-master.sh
This internally runs a command like below:
Spark Command: /opt/java/jdk1.8.0_201/bin/java -cp /opt/spark/conf/:/opt/spark/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host test.com --port 7077 --webui-port 8080
How to access from web?
http://test.com:8080 (the master web UI listens on port 8080 by default)
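If 7077 or 8080 are already taken on your host, both ports can be overridden in conf/spark-env.sh; a sketch with the default values shown:

```shell
# /opt/spark/conf/spark-env.sh -- optional overrides (defaults shown)
SPARK_MASTER_PORT=7077        # master RPC port (spark://host:7077)
SPARK_MASTER_WEBUI_PORT=8080  # master web UI port
```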
How to start spark shell?
cd /opt/spark
./bin/pyspark
FYI, this internally runs a command like below:
/opt/java/jdk1.8.0_201/bin/java -cp /opt/spark/conf/:/opt/spark/jars/* -Xmx1g org.apache.spark.deploy.SparkSubmit --name PySparkShell pyspark-shell
How to find the spark process with ps?
ps -ef | grep spark
e.g., root 13770 1 0 14:18 pts/0 00:00:10 /opt/java/jdk1.8.0_201/bin/java -cp /opt/spark/conf/:/opt/spark/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host test.com --port 7077 --webui-port 8080
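A plain `grep spark` also matches the grep process itself; the bracket trick below keeps it out of the results:

```shell
# '[o]rg.apache.spark' matches running Spark JVMs but not this grep's
# own command line, so no self-match appears in the output.
ps -ef | grep '[o]rg.apache.spark' || echo "no Spark JVMs running"
```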
PIP modules to install
pip install py4j
How to access in web?
http://localhost:8080