
Apache Spark Word Count Example



Spark Word Count Example

In the Spark word count example, we find the frequency of each word that exists in a particular file. Here, we use the Scala language to perform the Spark operations.

Steps to execute Spark word count example

In this example, we find and display the number of occurrences of each word.

  • Create a text file on your local machine and write some text into it.
$ nano sparkdata.txt 
  • Check the text written in the sparkdata.txt file.
$ cat sparkdata.txt
  • Create a directory in HDFS where the text file will be kept.
$ hdfs dfs -mkdir /spark
  • Upload the sparkdata.txt file to the /spark directory in HDFS.
$ hdfs dfs -put /home/codegyani/sparkdata.txt /spark
  • Now, run the following command to open the Spark shell in Scala mode.
$ spark-shell
  • Let's create an RDD by using the following command.
scala> val data = sc.textFile("/spark/sparkdata.txt")

Here, pass the path of the file that contains the data. Since the file was uploaded to the /spark directory in HDFS, that is the path we use.

  • Now, we can read the generated result by using the following command.
scala> data.collect
  • Here, we split the existing data into individual words by using the following command.
scala> val splitdata = data.flatMap(line => line.split(" "))
  • Now, we can read the generated result by using the following command.
scala> splitdata.collect
  • Now, perform the map operation.
scala> val mapdata = splitdata.map(word => (word, 1))

Here, we are pairing each word with the value 1.

  • Now, we can read the generated result by using the following command.
scala> mapdata.collect
  • Now, perform the reduce operation.
scala> val reducedata = mapdata.reduceByKey(_ + _)

Here, we are summing the counts generated for each distinct word.
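The effect of reduceByKey can be sketched on an ordinary Scala collection, without a cluster. In this sketch, groupBy plus a sum plays the role of reduceByKey, and the sample pairs are hypothetical:

```scala
// Plain-Scala sketch of what reduceByKey(_ + _) does:
// sum the 1s attached to each distinct word.
val pairs = List(("Spark", 1), ("Word", 1), ("Spark", 1))  // hypothetical sample
val reduced = pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2).sum) }
// reduced contains ("Spark", 2) and ("Word", 1)
```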

  • Now, we can read the generated result by using the following command.
scala> reducedata.collect

Here, we get the desired output: the number of occurrences of each word in the file.
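The whole pipeline above can also be sketched on ordinary Scala collections. This sketch runs without Spark and mirrors the flatMap, map, and reduce steps; the input lines are hypothetical:

```scala
// Word count on plain Scala collections, mirroring the RDD pipeline.
val lines = List("Spark Word Count Example", "Spark Word Count")  // hypothetical input

val counts = lines
  .flatMap(line => line.split(" "))                     // split lines into words
  .map(word => (word, 1))                               // pair each word with 1
  .groupBy(_._1)                                        // group the pairs by word
  .map { case (word, ps) => (word, ps.map(_._2).sum) }  // sum the counts per word
```

On an RDD, the grouping and summing are done in one step by reduceByKey(_ + _), which also combines counts locally on each partition before shuffling data across the cluster.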





