Article Updated: 2020-04-03
As usual, the download link for the file comes first.
File Name: apache-maven-3.6.3-bin.tar.gz
File size: 9.1 MB
Download Link: https://www.lanzous.com/iaykx6d
SHA256: 26AD91D751B3A9A53087AEFA743F4E16A17741D3915B219CF74112BF87A438C5
First, install Maven
1. Download Maven
- You can download it from the official site: https://maven.apache.org/download.cgi#Files.
- You can also use the Lanzou cloud link provided at the top of this article (be sure to verify the hash).
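Before extracting anything, it is worth confirming that the downloaded tarball matches the SHA256 published above. A minimal sketch, assuming the tarball sits in the current directory and `sha256sum` is available:

```shell
# Compare the downloaded tarball's SHA256 against the value published above.
EXPECTED=26ad91d751b3a9a53087aefa743f4e16a17741d3915b219cf74112bf87a438c5
ACTUAL=$(sha256sum apache-maven-3.6.3-bin.tar.gz | awk '{print $1}')
if [ "$ACTUAL" = "$EXPECTED" ]; then
    echo "hash OK"
else
    echo "hash MISMATCH -- do not install this file" >&2
fi
```

If the hashes differ, re-download the file rather than installing it.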
2. Extract and install
# Extract to /usr/local
sudo tar -zxvf apache-maven-3.6.3-bin.tar.gz -C /usr/local | tail -n 10
# Rename the directory and change its ownership
cd /usr/local/
sudo mv apache-maven-3.6.3/ maven
sudo chown -R bigdata:bigdata maven
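The later steps invoke Maven by its full path (/usr/local/maven/bin/mvn). Optionally — this step is not in the original tutorial — you can put Maven on your PATH so that plain `mvn` works. A sketch, assuming bash and the /usr/local/maven layout created above:

```shell
# Append Maven environment settings to ~/.bashrc (assumes bash).
echo 'export MAVEN_HOME=/usr/local/maven' >> ~/.bashrc
echo 'export PATH=$PATH:$MAVEN_HOME/bin'  >> ~/.bashrc
source ~/.bashrc
# Verify with: mvn -v   (it should report Apache Maven 3.6.3)
```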
3. Run the example
- First, create the directory tree ~/sparkapp2/src/main/java under the user's home directory:
cd ~
mkdir -p sparkapp2/src/main/java
- Then create the file SimpleApp.java in this directory tree:
vim ~/sparkapp2/src/main/java/SimpleApp.java
# The file contents are as follows:
/*** SimpleApp.java ***/
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        String logFile = "file:///usr/local/spark/README.md"; // Should be some file on your system
        JavaSparkContext sc = new JavaSparkContext("local", "Simple App",
                "file:///usr/local/spark/", new String[]{"target/simple-project-1.0.jar"});
        JavaRDD<String> logData = sc.textFile(logFile).cache();
        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();
        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("b"); }
        }).count();
        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
    }
}
- Then create a Maven descriptor file (pom.xml), which declares the standalone application's information and its dependency on Spark:
vim ~/sparkapp2/pom.xml
# The file contents are as follows:
<project>
    <groupId>edu.berkeley</groupId>
    <artifactId>simple-project</artifactId>
    <modelVersion>4.0.0</modelVersion>
    <name>Simple Project</name>
    <packaging>jar</packaging>
    <version>1.0</version>
    <repositories>
        <repository>
            <id>Akka repository</id>
            <url>http://repo.akka.io/releases</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.1.0</version>
        </dependency>
    </dependencies>
</project>
- Finally, use the find command to check the file structure; if everything is in place, you can package the project with Maven.
find ~/sparkapp2/
# Run the package command
cd sparkapp2/
/usr/local/maven/bin/mvn package
A screenshot of the run:
- When you see the green SUCCESS, the package was built successfully, as shown in the screenshot:
Note: the screenshot above shows it took about an hour, but the packaging wait felt like more than an hour to me — the first build downloads all of the project's dependencies, so the wait is always long ~
- Submit and run the program
/usr/local/spark/bin/spark-submit --class "SimpleApp" ~/sparkapp2/target/simple-project-1.0.jar
# The command above prints a lot of log output; instead, you can use the following command to see only the result you want
/usr/local/spark/bin/spark-submit --class "SimpleApp" ~/sparkapp2/target/simple-project-1.0.jar 2>&1 | grep "Lines with a"
# The parameters are explained as follows:
./bin/spark-submit
--class <main-class>          // the main class of the program to run, i.e. the application's entry point
--master <master-url>         // the master URL
--deploy-mode <deploy-mode>   // the deploy mode
... # other options           // other options
<application-jar>             // the application's JAR package
[application-arguments]       // arguments passed to the main method of the main class
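The breakdown above mentions the master URL without listing its values. For reference (a sketch; the host and port below are placeholders, not from the original tutorial), the common --master values are:

```shell
# Common --master values for spark-submit (host/port are placeholders):
#   local              run Spark locally with a single worker thread
#   local[4]           run locally with 4 worker threads
#   local[*]           run locally with as many threads as CPU cores
#   spark://HOST:7077  connect to a standalone Spark cluster master
#   yarn               connect to a YARN cluster
# Example: run the tutorial's jar locally with 4 threads
/usr/local/spark/bin/spark-submit --master 'local[4]' --class "SimpleApp" \
    ~/sparkapp2/target/simple-project-1.0.jar
```

When --master is omitted, as in the commands above, spark-submit falls back to the configured default (local mode in this tutorial's setup).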
A screenshot of running the first command:
Note: For details, see Professor Lin Ziyu's blog post: Getting Started with Spark 2.1.0: Installing and Using Spark.