Export XML Data from Microsoft Access – Part 1 – Tricks and Traps
Ever since Microsoft Access 2003, Access has had the inbuilt ability to easily export data as XML, and the schema of the underlying table or query to an XSD..
Even before this it was possible to write XML from Access by using VBA to write it out as a text file, or by setting a reference to the MSXML or MSXML2 library and build the document using the DOM (Document Object Model). This gives you full control, but also a lot of code to write. This article is not going to discuss this method, but rather the inbuilt ExportXML method.
The output format of the ExportXML is quite rigid..
- The data in the XML file is only contained in elements and not in attributes.
- The element names are derived solely from the field names of the underlying table or query
- There is no support for CDATA sections.
- You cannot manage custom namespaces
- the node names are inflexible, they are named on the Escaped name of the table or query
- the root of the document is “dataroot”
Matching an existing schema is usually not viable. If you cannot work within these restrictions then you are better off using the MSXML2 library. For your own custom applications though this is less likely to be problem; and if it is a problem you can possibly write an XSLT to transform it to something you need. Apart from these there are still more traps awaiting, so this article will try to show you how to avoid some of these traps.
In the first part of this series I will create XML using the Microsoft Access interactive tools. In the second part I will use VBA to achieve the same plus a little more. Even if I am only ever going to use VBA, I still always use the interactive tools first to make sure that everything works.
To start with I will export a single customer record to a file using the interactive Export to XML. You can do this by opening the Customer table, selecting a record and then clicking the Ribbon External Data – Export – XML File, giving the file a name, clicking Ok and completing the XML option dialog box. In this first example I do not want a schema nor the related tables, so the dialog box should look like this:
(Don’t be tempted to rush in and add the Orders table yet, you will probably not get what you want.)
The resultant XML file looks like this:
[sourcecode gutter=”False” language=”xml”] <?xml version="1.0" encoding="UTF-8"?><dataroot xmlns:od="urn:schemas-microsoft-com:officedata">
<Customers>
<ID>1</ID>
<Company>Company A</Company>
<Last_x0020_Name>Bedecs</Last_x0020_Name>
<First_x0020_Name>Anna</First_x0020_Name>
<Job_x0020_Title>Owner</Job_x0020_Title>
<Business_x0020_Phone>(123)555-0100</Business_x0020_Phone>
<Fax_x0020_Number>(123)555-0101</Fax_x0020_Number>
<Address>123 1st Street</Address>
<City>Seattle</City>
<State_x002F_Province>WA</State_x002F_Province>
<ZIP_x002F_Postal_x0020_Code>99999</ZIP_x002F_Postal_x0020_Code>
<Country_x002F_Region>USA</Country_x002F_Region>
</Customers>
</dataroot>
[/sourcecode]
The first thing to notice is that the fieldnames in the Northwind Customers table contain spaces which lead to XML element names like “ZIP_x002F_Postal_x0020_Code” which is quite unfriendly. The other thing to notice is that there are many elements missing from the customer tables; Access discards any element where the value is Null, and this might cause problems in the receiving application; an Schema file (XSD) might solve this latter problem, but not all receiving applications will make use of this. The other problem is that the root of the document is decided for you – “dataroot”.
The “dataroot” issue cannot be avoided, but by using Queries rather than Tables we can make life a lot easier. The Northwind Customers table contains a field “ZIP/Postal code” I never like naming fields with embedded spaces or funny characters, I prefer to use the usual alphanumeric characters, the underscore character and camel case for naming Access objects.
So for example I would create a query, and for the ZIP column us
[/sourcecode]
The missing null elements can be turned into empty elements, by using the VBA NZ function on each field. in the query such as:
[sourcecode language=”SQL” gutter=”False”] SELECT NZ([Web Page],"") AS "WebPage", etc etc FROM Customers[/sourcecode]
If you want to use the same field name in the output you must explicitly include the table name otherwise you will create a circular reference.
[sourcecode language=”SQL” gutter=”False”] SELECT NZ([Customers].[Notes],"") AS "Notes", etc FROM Customers[/sourcecode]
You should not use the NZ or any other function on the Primary and Foreign Keys of a relation when you want nested otherwise you will get an obscure run-time error message.
The node names in the XML output are the same as the Table/Query names. Something you will have to live with, unless your tables and queries are named just how you want to see them in the XML. I use a “qry” prefix in front of all my Queries, so my query qryXMLCustomer gives me the following XML.
[sourcecode gutter=”False” language=”xml”] <?xml version="1.0" encoding="UTF-8"?><dataroot xmlns:od="urn:schemas-microsoft-com:officedata">
<qryXMLCustomer>
<ID>1</ID>
<Company>Company A</Company>
<LastName>Bedecs</LastName>
<FirstName>Anna</FirstName>
<Email/>
<JobTitle>Owner</JobTitle>
<BusinessPhone>(123)555-0100</BusinessPhone>
<HomePhone/>
<MobilePhone/>
<FaxNumber>(123)555-0101</FaxNumber>
<Address>123 1st Street</Address>
<City>Seattle</City>
<State_Province>WA</State_Province>
<Zip_PostalCode>99999</Zip_PostalCode>
<CountryRegion>USA</CountryRegion>
<WebPage/>
<Notes/>
</qryXMLCustomer>
</dataroot>
[/sourcecode]
Other than the node name this is looking a lot more tidy than basing the XML on the Customers table.
The application that I will be using this XML with, Microsoft Word and Content Controls, is not particularly sophisticated in handling XML, so I can only use individual elements or repeating nodes. This means that I cannot use XPath to lookup data. So not only will I need an Access Query to overcome the problem discussed in the first part of this blog, but also to get all the lookup data that I will need such as Employee’s name, Shipper’s name and the Order Status. I can skip some fields if I do not think they will be necessary. On this point though, I generally try to make my Queries for producing XML to have wider usage than just the application at hand, so I tend to include most fields unless I am very sure I will not ever need it. This is what a node from my qryXMLOrder will look like:
[sourcecode gutter=”false” language=”xml”] <qryXMLOrder><OrderID>71</OrderID>
<EmployeeName>Nancy Freehafer</EmployeeName>
<CustomerID>1</CustomerID>
<OrderDate>24/05/2006</OrderDate>
<ShippedDate/>
<Shipper>Shipping Company C</Shipper>
<Ship>Anna Bedecs</Ship>
<OrderStatus>New</OrderStatus>
</qryXMLOrder>
[/sourcecode]
In the Northwind database the Customers and Orders table are joined in a relationship. When you Export the Customers table to XML on the previous page, you might have noticed that in the Options box you could choose related tables. For Queries you also need to establish a relationship
After setting a relationship if we open the qryXMLCustomer and select a customer and Export to XML the dialog box will give us the ability to include the qryXMLOrder.
So now lets take a look at some XML.
[sourcecode gutter=”false” language=”xml”] <?xml version="1.0" encoding="UTF-8"?><dataroot xmlns:od="urn:schemas-microsoft-com:officedata">
<qryXMLCustomer>
<ID>1</ID>
<Company>Company A</Company>
<LastName>Bedecs</LastName>
<FirstName>Anna</FirstName>
<Email/>
<JobTitle>Owner</JobTitle>
<BusinessPhone>(123)555-0100</BusinessPhone>
<HomePhone/>
<MobilePhone/>
<FaxNumber>(123)555-0101</FaxNumber>
<Address>123 1st Street</Address>
<City>Seattle</City>
<State_Province>WA</State_Province>
<Zip_PostalCode>99999</Zip_PostalCode>
<CountryRegion>USA</CountryRegion>
<WebPage/>
<Notes/>
<qryXMLOrder>
<OrderID>44</OrderID>
<EmployeeName>Nancy Freehafer</EmployeeName>
<CustomerID>1</CustomerID>
<OrderDate>24/03/2006</OrderDate>
<ShippedDate/>
<Shipper/>
<Ship>Anna Bedecs</Ship>
<OrderStatus>New</OrderStatus>
</qryXMLOrder>
<qryXMLOrder>
<OrderID>71</OrderID>
<EmployeeName>Nancy Freehafer</EmployeeName>
<CustomerID>1</CustomerID>
<OrderDate>24/05/2006</OrderDate>
<ShippedDate/>
<Shipper>Shipping Company C</Shipper>
<Ship>Anna Bedecs</Ship>
<OrderStatus>New</OrderStatus>
</qryXMLOrder>
</qryXMLCustomer>
</dataroot>
[/sourcecode]
If you view the XML in a document outline view, you will see the qryXMLOrder node repeated in the qryXMLCustomer context.
If you have many records in the related tables or many varied tables at the same context level this might get untidy, I would prefer to have the orders group together in a single node so my outline would look something like
Because the Access Export to XML will not wrap up nodes like this directly we have to create another query, and change the relationship we established before. The Query will contain just one field the same as used in the relationship between the qryXMLCustomer and qryXMLOrder, it will act as a container for the various qryXMLOrder nodes
I will name this query qryXMLOrders (plural) and the SQL for this query is very simple:
[sourcecode gutter=”false” language=”sql”] SELECT Customers.ID AS CustomerIDFROM Customers;
[/sourcecode]
and the relationship:
the Export to XML dialog box
and finally some XML
[sourcecode gutter=”false” language=”xml”] <?xml version="1.0" encoding="UTF-8"?><dataroot xmlns:od="urn:schemas-microsoft-com:officedata">
<qryXMLCustomer>
<ID>1</ID>
<Company>Company A</Company>
<LastName>Bedecs</LastName>
<FirstName>Anna</FirstName>
<Email/>
<JobTitle>Owner</JobTitle>
<BusinessPhone>(123)555-0100</BusinessPhone>
<HomePhone/>
<MobilePhone/>
<FaxNumber>(123)555-0101</FaxNumber>
<Address>123 1st Street</Address>
<City>Seattle</City>
<State_Province>WA</State_Province>
<Zip_PostalCode>99999</Zip_PostalCode>
<CountryRegion>USA</CountryRegion>
<WebPage/>
<Notes/>
<qryCustomerOrders>
<CustomerID>1</CustomerID>
<qryXMLOrder>
<OrderID>44</OrderID>
<EmployeeName>Nancy Freehafer</EmployeeName>
<CustomerID>1</CustomerID>
<OrderDate>24/03/2006</OrderDate>
<ShippedDate/>
<Shipper/>
<Ship>Anna Bedecs</Ship>
<OrderStatus>New</OrderStatus>
</qryXMLOrder>
<qryXMLOrder>
<OrderID>71</OrderID>
<EmployeeName>Nancy Freehafer</EmployeeName>
<CustomerID>1</CustomerID>
<OrderDate>24/05/2006</OrderDate>
<ShippedDate/>
<Shipper>Shipping Company C</Shipper>
<Ship>Anna Bedecs</Ship>
<OrderStatus>New</OrderStatus>
</qryXMLOrder>
</qryCustomerOrders>
</qryXMLCustomer>
</dataroot>
[/sourcecode]
For the Export to XML to work you cannot have multiple fields in the relationship between the tables/queries. The Primary Key of each table must be a single field.
Although you can create an empty element by converting Nulls to zero length spaces, you cannot force the output of empty nodes, and generally you would not want to, but the caution is that if you are creating a sample XML then you need to have output for all nodes. If the receiving application can use a Schema (XSD) then that is better than using a sample XML file, and you can create this with the Access Export to XML. However my intended application for this exercise is to create a merge document using Microsoft Word and Content Controls and this needs a sample XML file, but can’t use an XSD.
In a follow up blog, we will use this same example of interactively exporting XML and create the same using VBA, and venture into the mysteries of AdditionalData.
See also:
Part 2 – Export XML Data from Microsoft Access using VBA
Export XML Data from Microsoft Access – Part 3 – Matching a schema
Document Generation using Word Content Controls and Microsoft Access
Comments closed.