TheDeveloperBlog.com

C# Word Interop: Microsoft.Office.Interop.Word

This C# tutorial uses Microsoft.Office.Interop.Word. It opens a DOC file and reads the text in it.

Microsoft Word can be automated from C# code through its COM interop assembly.

You have a Microsoft Word document (.doc) and want to read it in your C# program. With the Microsoft.Office.Interop.Word assembly, we get the contents and formatting from the document.

Tip: Add the Microsoft.Office.Interop.Word assembly to your project. Go to Project -> Add Reference...

Example. First, we show the file we will read. It contains three paragraphs of one word each. The program first creates an Application instance and then calls Documents.Open on that variable.

Next: We loop through the Words collection and read the Text property on each element. We display each word and then call Quit.

Word document: word.doc

One

Two

three

C# program that uses Microsoft Word interop

using System;
using Microsoft.Office.Interop.Word;

class Program
{
    static void Main()
    {
        // Open a doc file.
        Application application = new Application();
        Document document = application.Documents.Open("C:\\word.doc");

        // Loop through all words in the document.
        int count = document.Words.Count;
        for (int i = 1; i <= count; i++)
        {
            // Write the word.
            string text = document.Words[i].Text;
            Console.WriteLine("Word {0} = {1}", i, text);
        }
        // Close word.
        application.Quit();
    }
}

Output

Word 1 = One
Word 2 =
Word 3 = Two
Word 4 =
Word 5 = three
Word 6 =

Paragraph marks. We see that Word 2, Word 4, and Word 6 print as empty. The Words collection counts punctuation and paragraph marks as items, so the paragraph mark that ends each of the three paragraphs appears as its own entry. If you have multiple words in a paragraph, each one is a separate element in the Words collection.

So: In Interop.Word, a paragraph is made up of a collection of one or more words.
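
The blank entries in the output above can be skipped before display. Here is a minimal sketch, assuming each blank entry's Text holds only whitespace (such as the "\r" that ends a paragraph). The VisibleWords helper and the sample array are hypothetical; the array stands in for values read from document.Words, so the sketch runs without Word installed.

```csharp
using System;
using System.Collections.Generic;

static class WordFilter
{
    // Hypothetical helper: keeps only entries that contain visible text.
    public static List<string> VisibleWords(IEnumerable<string> rawWords)
    {
        var result = new List<string>();
        foreach (string raw in rawWords)
        {
            // Trim() removes surrounding whitespace, including "\r".
            string trimmed = raw.Trim();
            if (trimmed.Length > 0)
            {
                result.Add(trimmed);
            }
        }
        return result;
    }
}

class Demo
{
    static void Main()
    {
        // Sample strings standing in for document.Words[i].Text values.
        string[] raw = { "One", "\r", "Two", "\r", "three", "\r" };
        foreach (string word in WordFilter.VisibleWords(raw))
        {
            Console.WriteLine(word);
        }
    }
}
```

In the interop program, the same check could be applied to document.Words[i].Text inside the loop before calling Console.WriteLine.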

Quit. Why is the application.Quit statement important? If you don't include it, the WINWORD.EXE process will remain in the process list after the program exits. Each later run then starts another WINWORD.EXE instance, and the leftover processes waste memory.
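
If Documents.Open or the read throws an exception, the Quit call at the end of Main is never reached. One way to guard against that is a try/finally block. This is only a sketch, not the article's original program: it assumes the same C:\\word.doc path and needs Word and the interop assembly installed to run.

```csharp
using System;
using Microsoft.Office.Interop.Word;

class Program
{
    static void Main()
    {
        Application application = new Application();
        try
        {
            // Same file path as the example above (an assumption).
            Document document = application.Documents.Open("C:\\word.doc");
            Console.WriteLine("Items in Words: {0}", document.Words.Count);
        }
        finally
        {
            // Runs whether or not the code above throws,
            // so WINWORD.EXE is asked to exit either way.
            application.Quit();
        }
    }
}
```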

Note: The Words collection is one-based, so it is important to iterate from 1 to Count inclusive. Correction suggested by Robert Ford.

Summary. We looked at the Microsoft.Office.Interop.Word assembly and learned how to read in data from a Word document. This can be useful when you have DOC or DOCX files and want to programmatically read in data from your C# program.
