Background

Back in the late 80’s early 90’s I wrote some software called Rendezvous and sold it as shareware. I wrote it in Clipper (a dBase III compiler). It’s purpose was generally for lead generation and outbound telemarketing. Telemarketing using databases was relatively new at the time. For general lead processing one of the better strategies was to call the prospective customer, follow up immediately with a brochure or letter and then call them back in few days hopefully before your letter got thrown in the bin. Being pre-Windows you had to construct your printout using printer control codes etc; a nightmare in the days of dot matrix, daisy wheel, line printers and soon laser and ink jet printers.

The dominant word processor at the time was WordPerfect; it had an incredibly powerful merge language and an ever increasing array of printer drivers, so armed with a WordPerfect SDK I entered the world of document generation, and I rewrote my Rendezvous program to work with WordPerfect. My connection with WordPerfect for document generation lasted for many years, but WordPerfect was a company that was slow to embrace Windows and staffed by engineers rather than marketers. Much to the disappointment of many people at the time, Microsoft Word soon came to dominate.

For authors of document generation systems the transition to Microsoft Word was a difficult time. With Word you had to live with a very very basic mail merge facility and some WordBasic. It wasn’t until VBA came along in 1997 that things improved slightly – but only slightly. The WordPerfect merge language was truly superior; you could do most anything with it albeit in a clumsy macro sort of way. But one of the strong features for document generation automation was that the logic for creating documents was embedded in the document template. Developers (or End Users) could create a data source (called a WordPerfect Secondary file, which with some tweaking could in include data from multiple tables), and could create templates called Primary files. Because of this Word’s popularity with users meant that no matter how much harder it was to automate the generation of complex documents, for everyday use it was so much easier than WordPerfect – Word hastened the demise of specialist word processing operator. The fact that users were no longer specialists, document generation was in large measure handed over to developers who controlled document automation from outside the document. Today document generation is mostly in the hands of large developer-centric organisations. This leaves the smaller developer somewhat in a quandary of how to generate documents from their custom software in a financially painless way for their clients.

For the Access developer where are we now.

Microsoft Access reports

In many cases a Microsoft Access Report is all that it needed, especially if there is never any need to manually edit a resultant document. It can be printed or saved to PDF.

Export to Word

An editable document can be created by exporting an Access report to Word. This can be done two ways:

Interactively

By selecting the appropriate action from the Ribbon.

This will create an RTF formatted report named the same as the report it was based on and saved in the user’s document folder

Using VBA code

Using the Northwind database as an example the code might be:

[sourcecode language=”VB” gutter=”false”] DoCmd.OpenReport "Invoice", acViewPreview, , _
"[Order ID]=" & Me.[Order ID] DoCmd.OutputTo acOutputReport, "Invoice", acFormatRTF, _
CurrentProject.Path & "\Inv" & Me.[Order_ID] & ".rtf", True
[/sourcecode]

This gives you control of the name of the document and its location, you can even force Word to accept it as a Word .doc file just by naming the output as .doc

[sourcecode language=”VB” gutter=”false”] DoCmd.OpenReport "Invoice", acViewPreview, , _
"[Order ID]=" & Me.[Order ID] DoCmd.OutputTo acOutputReport, "Invoice", acFormatRTF, _
CurrentProject.Path & "\Inv" & Me.[Order_ID] & ".doc", True
[/sourcecode]

My favourite though is to allow the user to choose the name and location by outputting to a .dot file in the user’s temporary folder.

[sourcecode language=”VB” gutter=”false”] DoCmd.OpenReport "Invoice", acViewPreview, , _
"[Order ID]=" & Me.[Order ID] DoCmd.OutputTo acOutputReport, "Invoice", acFormatRTF, _
Environ("temp") & "\Inv" & Me.[Order_ID] & ".dot", True
[/sourcecode]

No matter how you cut it though, there are problems with this. The RTF is an old Microsoft proprietary format which Microsoft abandoned in terms of development after 2008. It lacks styles, lists etc, so the output from Access is not at all MS Word-centric, in addition (or subtraction) the export from Access does not include the graphics and charts from the report, so even just logos on letters, invoices etc are a problem; and similarly subreports are not included so only the main report is exported. With the RTF export formatting often changes, especially at the end of the report.

Essentially data for exporting a report to a Word document needs to be flat, with one table, or one query providing the data. There can be many records on a page.

Word Mail Merge

Another alternative that is also relatively easy is to use Word mail merge, A merge template document is created and then merged with data which can be interactively directly from a table or a query.
Merge is very good for multiple form letters, address labels for envelopes or even email. The process and results are very different to Exporting to Word.

The steps for mail merge are:

  • Set up a data set, a table, a query, address list or a file CSV, Tab etc.
  • Create a form letter and connect the data to it
  • Add placeholders for the merge fields, and save the merge template
  • Connect the form letter to the data set (table, query, file)
  • Preview and perform the merge

The advantages of Mail merge over Exporting to Word include being able to use the full feature set of Word, styles, graphics (but not dynamic charts), and intelligent Word fields, conditional, including boilerplate assembly.

It is possible to use VBA and Word Automation to automate a mail merge such as:
[sourcecode language=”VB” gutter=”false”] Dim objWord As Word.Document
Set objWord = GetObject("C:\MyMerge.doc", "Word.Document")
objWord.Application.Visible = True
‘ Set the mail merge data source as the Northwind database.
objWord.MailMerge.OpenDataSource _
Name:="C:\Program Files\Microsoft " & _
"Office\Office\Samples\Northwind.mdb", _
LinkToSource:=True, _
Connection:="TABLE Customers", _
SQLStatement:="SELECT * FROM [Customers]"
‘ Execute the mail merge.
objWord.MailMerge.Execute
[/sourcecode]

Word Mail Merge is generally based on one record per document, so for example an Invoice like the one in Northwind would be very hard to create, and more complex structures impossible. But on the plus side you can make the documents attractive.

Populate book marks with data with VBA

Another way of creating documents is to use Word documents with Bookmarks. The basic idea with this is to create a Word document and insert named bookmarks where data needs to appear, and then using Word Automation to open the document, find the bookmarks and insert the required text at each bookmark. You can also insert other structures such as tables, so an Invoice like the one in the Northwind Invoice can be created.

An advantage of this is the full range of Word formatting features are available. Because the data is driven by programming code “pushing” data to the form, data can come from multiple data sets, eg invoice headers and invoice lines controlled by programming code. The disadvantage of this though is the amount of programming code needed to achieve a complex document and the general detachment of the template from the data can make the document more difficult to maintain. Often Add-Ins are used to make the insertion of bookmarks into the template easier for users (and developers). code, and where the users is not given so much say.

The basic VBA code for this is:
[sourcecode language=”VB” gutter=”false”] Dim objWord As Word.Document
Set objWord = GetObject("C:\MyTemplate.docx", "Word.Document")
objWord.Application.Visible = True
With objWord.ActiveDocument.Bookmarks
.Item(“Bookmarkname”).Range.Text = “Data to insert”
… etc etc
End With
[/sourcecode]

Custom Document Properties and Fields

Word comes with a number of predefined Document Properties such as Author, Pages, etc. From the File – Info – Properties – Advanced Properties, you can view the inbuilt properties and assign values to them, and you can create custom properties and assign the type and value to them and add them to the document.

The basic idea of using custom documents is that you create these properties and then using Insert QuickParts – Fields – DocProperty and choose the property you wish to insert.

From Access you would use something like the following code to populate the document with data.

[sourcecode language=”VB” gutter=”false”] Dim objWord As Word.Document
Set objWord = GetObject("C:\MyTemplate.docx", "Word.Document")
objWord.Application.Visible = True

With objWord.ActiveDocument.CustomDocumentProperties
Add Name:="CustomerNumber", _
LinkToContent:=False, _
Type:=msoPropertyTypeNumber, _
Value:=1000
End With
[/sourcecode]

An advantage of using document properties over bookmarks is that you can refer to a document property from many places in the document, so if data needs to repeated multiple time, you only have to update the property value once and everything update. With Bookmarks you have have a differently named bookmark for every location, and creating multiple record documents can be problematic

As in using Bookmarks, the management of this can get quite complex. Personally I do not like Access to do so much work other than just providing the data. To achieve this it is possible of course to pull the data from inside Word, and link to the data in Access (or other databases), and similarly populate bookmarks or document properties, however this generally means that the report selection is handled from with Word, however since my applications are usually driven from Access I usually prefer it the other way around. But this is an option.

Visual Studio and specialist third party .NET libraries

Products such as Aspose provide tools for building documents, and generally this is less of an option for Access developers.

Specialist document merging systems

Products such as HotDocs and Windward are probably the most likely ones than an Access developer is likely to use.

The disadvantage of these systems is the cost and potential complexity of integration into an Access application. I have not used HotDocs, but I have used Windward extensively, and Windward is my favoured program for document generation. However they are generally too expensive for simple projects. They both have an Add-In for Microsoft Word which appears on a Word Ribbon. Options from the Ribbon allow connection to a data source (files or various databases), which allows a document to be marked up using drag and drop, and that process is quite easy so end users can generally feel comfortable with them, but also can handle more complex operation. So creation of templates can be with the user, or by or in collaboration with the developer. The generation of documents can occur directly from within Word, but generally for anything serious they require a server component, which may be on a local or remote machine. I am familiar with Windward as it supports XML and allows XPath statements and functions for accessing data. Both products can use SQL statements when connected to SQL databases, and ODBC data sources.

HotDocs is more likely to be used in applications where there is more human interaction in the document assembly. This perhaps makes it a stronger contended for legal merging where there is considerable individuality about the documents but nonetheless based on a precedent document. HotDocs also works with WordPerfect which was my previous favourite document generating platform.

Windward is very good for complex documents and can include dynamic charts, and boilerplating of parts of the document.

Both products generate Word documents which are standalone, discrete from the original data they were based on..

The market for document generation is always widening, and a relatively new class of products CCM (Customer Communication Management) which often are more oriented to the generation of digital content, also can produce Word documents. Products of this type include OpenText, Ecrion, Kofax, and many more.

Microsoft Word 2003 XML Markup and XSLT

When Microsoft released Word 2003, they entered into a brand new era of using XML as the underlying structure for Word documents in place of the Microsoft proprietary binary file .DOC format. One of the reason for this was customer demand for long term storage of documents in non-proprietary formats so that content could be readily accessed in the future. Microsoft gave no real tools for creating documents using XML, but did allow you to attach XML tags to documents, the major purpose of this was to allow extraction of data from a document, but also (perhaps unintentionally) opened the possibility of merging data with documents using XSLT, which meant that Word itself is not needed to create the finally merged document, other than to open, view and print it. Microsoft Word was never good being used as a server application however XSLT was available both on client side and server side of all the major operating systems. Microsoft without much fanfare released a command line too called the Microsoft Office Word 2003 WordML XSLT Inference Tool, which generated am XSLT template from the XML marked up document. With this you could use almost any XSLT processor to merge XML data and the template to create a Word .DOCX file.

There were two problems with this.
The XSLT file generated from the Microsoft Office Word 2003 XSLT Inference Tool. could not itself be opened by Word, nor opened intelligently from any other program. If you needed to make even minor edits to the template you really had to go back to the original source document and start the transformation process from scratch. Often it was necessary to manually edit the XSLT to get the desired results, so having to start from scratch each time could be quite an impost. But the fact that Microsoft had committed to this technology, and had provided some tools was an encouraging thing. I used this quite extensively until an additional problem popped up

The first problem was an annoyance, but the second a killer. In May 2007 Microsoft was successfully sued for infringing a patent regarding XML markup by the company i4i and it founder Michel Vulpe. As part of the settlement in 2010 Microsoft released an update as part of the normal Windows and Office updates to remove the XML tagging in Office 2003, and 2007 and was precluded from using the technology in Office 2010. Although XML is open, specific implementation of it can be subject to copyright.

Once an XSLT template had been created using the Word 2003 XSLT Inference tool though, it was still possible to generate a new document based on it, because Word was not again involved until after the document was generated, and that document was no longer dependant on the template or the data source. As soon as the document was opened in Word, the markup itself was removed with a helpful message from Microsoft, but that was already too late, the document had been populated with the data and the XSLT did its job. But it meant that these template could no longer be created, nor old ones edited.

Because an XSLT processor and not Word was used for the “merging” of the XML data with the XSLT template, the process was very useful for server side generation. Word like the other Office programs, are not suitable for running as server processes.

Word Content Controls

In Word 2007, Microsoft introduced Content Controls. Content Controls are a little like Word Fields. The various templates supplied with Microsoft Word 2007, 2010 and 2013 include Content Controls which prompt the user to add information from the keyboard Content Controls can also be bound/mapped to XML element forming part of the document file structure. Microsoft changed the layout of the Word DOCX file, so that rather than being directly an XML file it is a package of components, packaged and compressed together. To look at the file structure of a Word document the easiest way is to rename the document extension to .ZIP, and then open it with an archive package like WinZip, 7-Zip, WinRAR. In the Word 2003 XML document format, the document was itself only an XML conforming to the WordML XML vocabulary, graphics etc formed part of the document’s XML data. With the later version graphic files are saved in the package file and referenced by the content XML. Similarly you can add an XML document to the package, and you can then bind this to Content Controls. In Word 2013 Microsoft did some tweaking to the Content Controls and now they are far more useful for document generation.

See other blog article Document Generation using Word Content Controls