TheDeveloperBlog.com


Program to Find the Most Repeated Word in a Text File

Explanation

In this program, we need to find the most repeated word in a given text file. Open the file in read mode using a file pointer and read it line by line. Convert each line to lowercase, remove the punctuation marks, split it into words, and store the words in an array. Then iterate through the array, count the frequency of each word, and compare it with maxCount. If the frequency is greater than maxCount, store the frequency in maxCount and the corresponding word in the variable word. The content of the data.txt file used in the program is shown below.

data.txt

A computer program is a collection of instructions that performs specific task when executed by a computer.

Computer requires programs to function.

Computer program is usually written by a computer programmer in programming language.

A collection of computer programs, libraries, and related data are referred to as software.

Computer programs may be categorized along functional lines, such as application software and system software.
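To run the programs below, data.txt has to exist in the working directory. The following is a minimal sketch that writes the sentences listed above to a file. It assumes one sentence per line with no blank lines in between, and the names SAMPLE_LINES and write_sample are hypothetical helpers, not part of the original programs.

```python
from pathlib import Path

# Sentences copied from the data.txt listing above.
# Assumption: one sentence per line, with no blank lines in between.
SAMPLE_LINES = [
    "A computer program is a collection of instructions that performs specific task when executed by a computer.",
    "Computer requires programs to function.",
    "Computer program is usually written by a computer programmer in programming language.",
    "A collection of computer programs, libraries, and related data are referred to as software.",
    "Computer programs may be categorized along functional lines, such as application software and system software.",
]


def write_sample(path="data.txt"):
    """Write the sample sentences to `path` (hypothetical helper name)."""
    Path(path).write_text("\n".join(SAMPLE_LINES) + "\n", encoding="utf-8")
    return path
```

Calling write_sample() once creates data.txt next to the script.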

Algorithm

  1. The variable maxCount stores the count of the most repeated word. Initialize it to 0.
  2. Open the file in read mode using a file pointer.
  3. Read a line from the file. Convert the line to lowercase and remove the punctuation marks.
  4. Split the line into words and store them in an array. Repeat steps 3 and 4 until the end of the file is reached.
  5. Use two loops to iterate through the array. The outer loop selects the word to be counted and sets count to 1. The inner loop compares the selected word with the rest of the array and increments count by 1 for each match.
  6. If count is greater than maxCount, store the value of count in maxCount and the corresponding word in the variable word.
  7. At the end, maxCount holds the maximum count and the variable word holds the most repeated word.
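The steps above can be tried on a single sentence before reading a file. This is a small sketch, assuming only commas and periods count as punctuation. The function names tokenize and most_repeated are illustrative and do not appear in the solutions below.

```python
def tokenize(line):
    """Steps 3-4: lowercase, drop commas and periods, split on whitespace."""
    cleaned = line.lower().replace(",", "").replace(".", "")
    return cleaned.split()


def most_repeated(words):
    """Steps 5-7: nested loops; returns (word, maxCount)."""
    word, max_count = "", 0
    for i in range(len(words)):
        count = 1
        # Compare the selected word with the rest of the array
        for j in range(i + 1, len(words)):
            if words[i] == words[j]:
                count += 1
        # Strict comparison: on a tie, the word seen first is kept
        if count > max_count:
            max_count = count
            word = words[i]
    return word, max_count


print(tokenize("Computer requires programs to function."))
# ['computer', 'requires', 'programs', 'to', 'function']
print(most_repeated(tokenize("A computer program is executed by a computer.")))
# ('a', 2)
```

In the second call both "a" and "computer" occur twice. Because step 6 uses a strict greater-than comparison, the word that reaches the maximum first is the one reported.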

Solution

Python

word = ""
maxCount = 0
words = []
 
#Opens the file in read mode; the with block closes it automatically
with open("data.txt", "r") as file:
    #Gets each line till end of file is reached
    for line in file:
        #Converts the line to lowercase, removes commas and periods, and splits
        #on any whitespace so that newlines and empty strings are not stored as words
        string = line.lower().replace(',', '').replace('.', '').split()
        #Adding all words generated in previous step into words
        words.extend(string)
 
#Determine the most repeated word in a file
for i in range(0, len(words)):
    count = 1
    #Count each word in the file and store it in variable count
    for j in range(i+1, len(words)):
        if words[i] == words[j]:
            count = count + 1
            
    #If maxCount is less than count then store value of count in maxCount
    #and corresponding word to variable word
    if count > maxCount:
        maxCount = count
        word = words[i]
        
print("Most repeated word: " + word)

Output:

 Most repeated word: computer
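The nested loops compare every pair of words, so the running time grows quadratically with the number of words. As an alternative, not the method used in this article, the counts can be collected in a single pass with collections.Counter from the Python standard library. The sketch below assumes the same punctuation handling as the program above, and the function name most_repeated_word is hypothetical.

```python
from collections import Counter


def most_repeated_word(path):
    """Return (word, count) for the most frequent word in the file at `path`."""
    counts = Counter()
    with open(path, "r") as f:
        for line in f:
            # Same cleaning as above: lowercase, drop commas and periods
            counts.update(line.lower().replace(",", "").replace(".", "").split())
    if not counts:
        # Empty file: there is no word to report
        return None, 0
    # most_common(1) returns a one-element list of (word, count)
    return counts.most_common(1)[0]
```

For example, most_repeated_word("data.txt") returns the word together with its count. When several words share the highest count, most_common lists the one encountered first (Python 3.7 and later), which matches the tie behavior of the nested-loop version.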

C

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
 
//Declared static so that the large array is not placed on the stack
static char words[1000][1000];
 
int main()
{   
    FILE *file;
    //line must start as NULL so that getline() allocates the buffer
    char *line = NULL, word[1000] = "";
    size_t len = 0;
    int i = 0, j = 0, k, maxCount = 0, count;
    
    //Opens file in read mode
    file = fopen("data.txt","r");
    
    //If file doesn't exist
    if (file == NULL){
        printf("File not found");
        exit(EXIT_FAILURE);
    }
    
    //Since, C doesn't provide in-built function, 
    //following code will split content of file into words.
    //getline() is a POSIX function and returns -1 at end of file
    while (getline(&line, &len, file) != -1) {
        
        for(k=0; line[k]!='\0'; k++){
            //Here, i represents row and j represents column of two-dimensional array words 
            if(line[k] != ' ' && line[k] != '\n' && line[k] != ',' && line[k] != '.' ){
                //Characters that do not fit into the array are dropped
                if(i < 1000 && j < 999){
                    words[i][j++] = tolower((unsigned char)line[k]);
                }
            }
            else if(j > 0){
                //A separator ends the current word; consecutive separators are skipped
                words[i][j] = '\0';
                //Increment row count to store new word
                i++;
                //Set column count to 0
                j = 0;
            }
        }
    }
    
    //Store the last word if the file does not end with a separator
    if(j > 0){
        words[i][j] = '\0';
        i++;
    }
    
    int length = i;
    
    //Determine the most repeated word in a file
    for(i = 0; i < length; i++){
        count = 1;
        //Count each word in the file and store it in variable count
        for(j = i+1; j < length; j++){
            if(strcmp(words[i], words[j]) == 0){
                count++;
            } 
        }
        //If maxCount is less than count then store value of count in maxCount 
        //and corresponding word to variable word
        if(count > maxCount){
            maxCount = count;
            strcpy(word, words[i]);
        }
    }
    
    printf("Most repeated word: %s\n", word);
    free(line);
    fclose(file);
    
    return 0;
}

Output:

Most repeated word: computer

Java

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
 
public class MostRepeatedWord {
    
    public static void main(String[] args) throws Exception {
        String line, word = "";
        int count = 0, maxCount = 0;
        ArrayList<String> words = new ArrayList<String>();
        
        //Opens file in read mode
        FileReader file = new FileReader("data.txt");
        BufferedReader br = new BufferedReader(file);
        
        //Reads each line
        while((line = br.readLine()) != null) {
            String string[] = line.toLowerCase().split("([,.\\s]+)");
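            //Adding all words generated in previous step into words
            for(String s : string){
                //split() yields an empty string for a blank line or a leading separator, so skip it
                if(!s.isEmpty()){
                    words.add(s);
                }
            }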
        }
        
        //Determine the most repeated word in a file
        for(int i = 0; i < words.size(); i++){
            count = 1;
            //Count each word in the file and store it in variable count
            for(int j = i+1; j < words.size(); j++){
                if(words.get(i).equals(words.get(j))){
                    count++;
                } 
            }
            //If maxCount is less than count then store value of count in maxCount 
            //and corresponding word to variable word
            if(count > maxCount){
                maxCount = count;
                word = words.get(i);
            }
        }
        
        System.out.println("Most repeated word: " + word);
        br.close();
    }
}

Output:

Most repeated word: computer

C#

using System;
using System.Collections;
 
public class MostRepeatedWord
{    
    public static void Main()
    {
        String line, word = "";
        int count = 0, maxCount = 0;
        ArrayList words = new ArrayList();
        
        //Opens file in read mode
        System.IO.StreamReader file = new System.IO.StreamReader(@"data.txt"); 
        
        //Reads each line
        while((line = file.ReadLine()) != null){
            String[] string1 = line.ToLower().Split(new Char [] {',' , '.',' '},StringSplitOptions.RemoveEmptyEntries);
            //Adding all words generated in previous step into words
            foreach(String s in string1){
                words.Add(s);
            }
        }
        
        //Determine the most repeated word in a file
        for(int i = 0; i < words.Count; i++){
            count = 1;
            //Count each word in the file and store it in variable count
            for(int j = i+1; j < words.Count; j++){
                if(words[i].Equals(words[j])){
                    count++;
                } 
            }
            //If maxCount is less than count then store value of count in maxCount 
            //and corresponding word to variable word
            if(count > maxCount){
                maxCount = count;
                word = (String) words[i];
            }
        }
        
        Console.WriteLine("Most repeated word: " + word);
        file.Close();
    }
}

Output:

Most repeated word: computer

PHP

<!DOCTYPE html>
<html>
<body>
<?php
$word = "";
$count = $maxCount = 0;
$words = array();
 
//Opens file in read mode
$file = fopen("data.txt", "r");
 
//Reads each line
while (($line = fgets($file)) !== false) {
    $line = strtolower($line);
    $line = str_replace(',' ,'', $line);
    $line = str_replace('.', '', $line);
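    //Splits on any whitespace and drops empty strings, so the trailing newline is not kept as part of a word
    $string = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);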
    
    //Adding all words generated in previous step into words
    for($i = 0; $i < count($string); $i++){
        array_push($words, $string[$i]);
    }
}
 
//Determine the most repeated word in a file
for($i = 0; $i < count($words); $i++){
    $count = 1;
    //Count each word in the file and store it in variable count
    for($j = $i+1; $j < count($words); $j++){
        if($words[$i] == $words[$j]){
            $count++;
        } 
    }
    //If maxCount is less than count then store value of count in maxCount 
    //and corresponding word to variable word
    if($count > $maxCount){
        $maxCount = $count;
        $word = $words[$i];
    }
}
 
print("Most repeated word: " . $word);
fclose($file);
?>
</body>
</html>

Output:

Most repeated word: computer
