TheDeveloperBlog.com


Python Word Count Method (re.findall)

Count the number of words in a string. This article introduces wordcount, a method based on regular expressions.
Word Count. How many words are in a string? Here we develop a Python method, wordcount, that uses re.findall to count words. It locates and counts runs of non-whitespace characters with a special pattern.
Example. The re.findall method is the most important part of this solution. It does not simply find one match. It finds all non-overlapping matches within a string and returns them as a list. By counting those matches, we count the words.

Pattern: We specify the pattern \S+ in the re.findall method. This means "one or more non-whitespace characters."

Len: We use the len() built-in to count the number of elements in the resulting list. This equals the number of words in the input string.
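
To make the two steps concrete, here is a minimal sketch of what re.findall returns before len() is applied. The sample string is only an illustration. Note that punctuation stays attached to the neighboring word, so it does not add to the count.

```python
import re

sample = "To be, or not to be"  # illustrative input

# \S+ matches each run of one or more non-whitespace characters.
tokens = re.findall(r"\S+", sample)
print(tokens)       # ['To', 'be,', 'or', 'not', 'to', 'be']

# The number of matches is the word count.
print(len(tokens))  # 6
```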

Python program that counts words

    import re

    def wordcount(value):
        # Find all runs of non-whitespace characters.
        # A raw string keeps \S from being read as a string escape.
        words = re.findall(r"\S+", value)
        # Return length of resulting list.
        return len(words)

    value = "To be or not to be, that is the question."
    print(wordcount(value))

    value = "Stately, plump Buck Mulligan came from the stairhead"
    print(wordcount(value))

    value = ""
    print(wordcount(value))

Output

    10
    8
    0
I verified that the method correctly counted the words in both (trivial) example sentences, and it returns 0 for the empty string. On more complex samples, such as text containing markup, results may be less accurate. Note that the second phrase has no trailing punctuation, yet its words are still counted correctly.

So: The example method does not count "word endings" but rather the words themselves.
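
The markup caveat can be illustrated with a short sketch. The HTML snippet and the wordcount_no_tags helper are hypothetical, added here only for demonstration. The tag-stripping pattern is deliberately naive and is an assumption, not a robust HTML parser.

```python
import re

def wordcount(value):
    # Count runs of non-whitespace characters.
    return len(re.findall(r"\S+", value))

def wordcount_no_tags(value):
    # Hypothetical helper: naively replace anything that looks like
    # a tag with a space, then count as before.
    text = re.sub(r"<[^>]+>", " ", value)
    return wordcount(text)

html = '<p class="intro">Hello world</p>'
print(wordcount(html))          # 3, because tag fragments count as words
print(wordcount_no_tags(html))  # 2
```

For real documents, a proper HTML parser is likely a safer way to extract the text before counting.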

Summary. The regular-expression-based method for counting words does not exactly mirror every word-counting implementation. Microsoft Word, for example, uses a slightly different algorithm. But this version is often within 0.05% of its results.
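
One reason counts differ between tools is that "word" can be defined in more than one way. The sketch below compares the \S+ pattern with \w+, an alternative chosen here purely for illustration. It is not a description of Microsoft Word's algorithm, which is not specified here.

```python
import re

text = "A well-known rule: don't stop."  # illustrative input

# Whitespace-delimited tokens, as in wordcount.
by_whitespace = re.findall(r"\S+", text)
# Runs of letters, digits, and underscores: an alternative definition
# that splits on hyphens and apostrophes.
by_word_chars = re.findall(r"\w+", text)

print(len(by_whitespace))  # 5
print(len(by_word_chars))  # 7
```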