Export XML Data from Microsoft Access – Part 3 – Matching a schema
Previously
Export XML Data from Microsoft Access – Part 1 – Tricks and Traps
Export XML Data from Microsoft Access – Part 2 – using VBA
In the previous two parts of this series we found it easy to Export XML data from Microsoft Access either using the UI or VBA, but that the schema (or structure) of the output was controlled by Access. A typical look of an XML document generated by Access was:
[sourcecode gutter=”False” language=”xml”] <?xml version="1.0" encoding="UTF-8"?><dataroot xmlns:od="urn:schemas-microsoft-com:officedata"
generated="2017-12-27T10:19:33">
<qryInvoiceHeader>
<ID>1</ID>
<Customer_x0020_Name>Company A</Customer_x0020_Name>
<Last_x0020_Name>Bedecs</Last_x0020_Name>
<First_x0020_Name>Anna</First_x0020_Name>
<Business_x0020_Phone>(123)555-0100</Business_x0020_Phone>
<Fax_x0020_Number>(123)555-0101</Fax_x0020_Number>
<Delivery_x0020_Address>123 1st Street</Delivery_x0020_Address>
<City>Seattle</City>
<State_x002F_Province>WA</State_x002F_Province>
<ZIP_x002F_Postal_x0020_Code>99999</ZIP_x002F_Postal_x0020_Code>
<Country_x002F_Region>USA</Country_x002F_Region>
<qryInvoiceLine>
…
…
etc
</qryInvoiceLine>
<qryInvoiceLine>
…
…
etc
</qryInvoiceLine>
</qryInvoiceHeader>
</dataroot> [/sourcecode]
So lets review some of the tips that were given in tidying things up:
- Use Queries to output the data
- Remove spaces and non-XML-friendly characters (eg punctutation) from the output names in the Query or replace them with an underscore. For example [CustomerName] in place of [Customer Name] or [Postal_Zip] in place of [Postal/Zip]
- If there are multiple related queries such as qryInvoice, qryInvoiceHeader, qryInvoiceLine then set the relationship between the Queries in the Access Relationship Editor
- Better still where appropriate create an intermediate query such as qryInvoiceLines with just one field to link between qryInvoiceHeader and qryInvoiceLine so you can better nest the output.
- Note that Access will only observe a relationship when exporting XML when there is a single value for the link field. For example CustomerId;InvoiceNo as the Primary key in the parent record will cause a problem.
- Use the NZ([value],””) function to make Null value fields appear as empty elements in the XML output, since Access will not output a Null value element – unless you want that behaviour.
- Use the format() function to format dates – Access does not directly format a date nicely in the XML – unless you like “2017-12-24T11:13:06”
This is all ok if you have control of the application that will use this data, but more often than not you will have to fit into a particular schema seemingly making Access quite useless for most XML applications. However with a little bit of effort this might not necessarily be the case.
Access supports XSLT. XSl (eXtensible Stylesheet Language) is a styling language for XML, and XSLT is for XSL Transformations. XSLT also uses another XML technology called XPath. XPath is used to select elements and attributes and navigate to the parts that you want to work on and XSL are the things you can do with it. The two work together. You can learn more about them at W3Schools but for what we are going to be doing you really do not need to become expert in these technologies. Most frequently XSLT is for transforming XML documents to create HTML documents for displaying in a web browser. But XSLT can be used to transform XML of one form, to XML of another form.
An example of an XSLT Template (taken from the W3Schools.com website) is
[sourcecode gutter=”False” language=”xml”] <?xml version="1.0"?><xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<body>
<h2>My CD Collection</h2>
<table border="1">
<tr bgcolor="#9acd32">
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each select="catalog/cd">
<tr>
<td><xsl:value-of select="title"/></td>
<td><xsl:value-of select="artist"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
An XML data source and an XSL template like above are passed into an XSLT processor, to produce a new HTML, XML, or text, et al, document
Using VBA we can ExportXML using:
[sourcecode gutter=”False” language=”vb”] Application.ExportXML ObjectType, _DataSource As String, _
[DataTarget As String], _
[SchemaTarget As String], _
[PresentationTarget As String], _
[ImageTarget As String], _
[Encoding As AcExportXMLEncoding = acUTF8], _
[OtherFlags As AcExportXMLOtherFlags], _
[WhereCondition As String], _
[AdditionalData])
[/sourcecode] The previous article discusses this.
As a second stage we can take the output of the ExportXML function and transform this into a new XML document. We can use the inbuilt function:
[sourcecode gutter=”False” language=”vb”]
Application.TransformXML _
DataSource as String, _
TransformSource as String, _
OutputTarget as String, _
Optional WellFormedXMLOutput as Boolean, _
Optional ScriptOption as AcTransformXMLScriptOption
[/sourcecode]
however I am going to suggest not using this as I have found it quite unreliable, and in its place I recommend using the MSXML2 library.
Although we may have a plan of how we want the final XML document to look, or we have to meet a Schema specification, Access will probably require a few additional fields so that it can tie the various nodes together. Rather than going into the detail of this, as it will be dependant on each case I am going to take a sample output of the Application.ExportXML and then show how this can be altered using XSLT.
So lets dive in.
[sourcecode gutter=”True” language=”vb”] <?xml version="1.0" encoding="UTF-8"?><dataroot xmlns:od="urn:schemas-microsoft-com:officedata"
generated="2017-12-28T05:11:15">
<qryCustomers>
<CustomerID>27</CustomerID>
<Company>Company AA</Company>
<Contact>Karen Toh</Contact>
<qryInvoices>
<InvoiceID>15</InvoiceID>
<InvoiceDate>24 March 2017</InvoiceDate>
<OrderID>30</OrderID>
<CustomerID>27</CustomerID>
<qryInvoiceLines>
<OrderID>30</OrderID>
<qryInvoiceLine>
<OrderID>30</OrderID>
<ProductName>Northwind Traders Almonds</ProductName>
<Quantity>10</Quantity>
<UnitPrice>30</UnitPrice>
</qryInvoiceLine>
<qryInvoiceLine>
<OrderID>30</OrderID>
<ProductName>Northwind Traders Cake Mix</ProductName>
<Quantity>5</Quantity>
<UnitPrice>3</UnitPrice>
</qryInvoiceLine>
<qryInvoiceLine>
<OrderID>30</OrderID>
<ProductName>Northwind Traders Cajun Seasoning</ProductName>
<Quantity>9</Quantity>
<UnitPrice>5</UnitPrice>
</qryInvoiceLine>
</qryInvoiceLines>
</qryInvoices>
</qryCustomers>
</dataroot>
[/sourcecode]
There are a few things that we will change:
- In line 2 the root element name from dataroot to CustomersHistory
- In line 2 remove the namespace attribute, the generated attribute can stay
- In line 3 change the element qryCustomers to Customer
- In lines 4 to 6 change the CustomerID, Company and Contact to be attributes of the Customer element
- in Line 7 change qryInvoices to Invoice
- in Line 12 change qryInvoiceLines to InvoiceLines
- in Line 13 remove OrderID element as it is redundant but was needed by Access for linking and exporting qryInvoiceLines as a child node of qryInvoices
- in Line 14 change qryInvoiceLine to InvoiceLine
- In Line 15 remove OrderID element as it is redundant but was needed byAccess for linking and exporting qryInvoiceLine as a child node of qryInvoiceLines, this also is the same for lines 21 and 27
Rather than use the inbuilt Application.TransformXML routine, I have found it is not fully supporting of XPath 1.0 and XSLT 1.0. So I prefer to us MSXML2 which despite its name, is also only supports version 1 of XPath and XSLT, although comprehensively.
I use the following function with mimics the inbuilt function
The function below mimics the Application.TransformXML
strXSLT As String, _
strOutputXML As String)
Dim oXmlFile As Object
Dim oXsltFile As Object
Dim oOutputFile As Object
Set oXmlFile = CreateObject("MSXML2.DOMDocument")
Set oXsltFile = CreateObject("MSXML2.DOMDocument")
Set oOutputFile = CreateObject("MSXML2.DOMDocument")
With oXsltFile
.async = False
.Load (strXSLT)
End With
With oXmlFile
.async = False
.Load (strInputXML)
.transformNodeToObject oXsltFile, oOutputFile
End With
oOutputFile.Save strOutputXML
End Sub
[/sourcecode]
There is no need to set a reference as this routine uses Late Binding although if you want to look more at the MSXML2 features and want to have Intellisense then you can add a reference to Microsoft XML, v6.0, and Dim oXMLFile etc etc as MSXML2.DOMDocument
We can then call this like this:
[sourcecode gutter=”False” language=”vb”]
TransformXML strInputXML:=CurrentProject.Path & "\temp.xml", _
strXSLT:=CurrentProject.Path & "\CustomersHistory.xslt", _
strOutputXML:=CurrentProject.Path & "\CustomersHistory.xml"
[/sourcecode]
So now we have to build the XSL Template.
The first thing we will do is create what is called an Identity Transformation. This copies the source to the target without any change, but it is a useful
save the following as copy.xslt
[sourcecode gutter=”False” language=”xml”]
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
[/sourcecode]
This work by matching all attributes and nodes, copy each one matched then applying the identity transformation work recursively though each element. It would have been sufficient to just match nodes as Access does not ExportXML as attributes. There are a few variations of the copy.xslt depending on what feature you want copied, eg processing instructions, comments etc.
The next stage is to create the XSL Template that we will use to transform the Access generated XML document.
Create a document, you can use Notepad, or an XML Editor such as Liquid XML, XMLPad, etc. The advantage of a good but basic XML Editor is that you can check that the XSL Template you are building is well formed and you can perform transformations without having to go back to Access to run the TransformXML subroutine.
Create a document name CustomersHistory.xslt and add this
[sourcecode gutter=”False” language=”xml”]
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="copy.xslt"/>
<xsl:strip-space elements="*"/>
<xsl:output method="xml" version="1.0" indent="yes" encoding="UTF-8"/>
</xsl:stylesheet>
[/sourcecode]
If you were to use this starter in the TransFormXML subroutine, as is, you would get a copy of the source XML in the output.
If we add the following in the blank area of the template, before the ending
[sourcecode gutter=”False” language=”xml”]
<!– Rename <dataroot> element to <CustomersHistory> –>
<xsl:template match="dataroot">
<CustomersHistory>
<xsl:apply-templates select="@*|node()"/>
</CustomersHistory>
</xsl:template>
[/sourcecode]
This matches the dataroot element and replaces it with the
and run the transformation again we will get the same document with the root element changed from dataroot to CustomersHistory
So now the full CustomersHistory.xslt with comments working through our wish-list higher up the page.
[sourcecode gutter=”True” language=”xml”]
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:import href="copy.xslt"/>
<xsl:strip-space elements="*"/>
<xsl:output method="xml" version="1.0" indent="yes" encoding="UTF-8"/>
<!– Rename <dataroot> element to <CustomersHistory> –>
<xsl:template match="dataroot">
<CustomersHistory>
<xsl:apply-templates select="@*|node()"/>
</CustomersHistory>
</xsl:template>
<!– Remove namespace declarations –>
<xsl:template match="*">
<xsl:element name="{name()}" namespace="{namespace-uri()}">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
<!– Rename <qryCustomers> element to <Customer> and move element to attributes –>
<xsl:template match="qryCustomers">
<Customer>
<xsl:for-each select="CustomerID|Company|Contact">
<xsl:attribute name="{local-name(.)}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:for-each>
<xsl:apply-templates select="@*|node()"/>
</Customer>
</xsl:template>
<!– Remove the elements that were moved to attributes above –>
<xsl:template match="qryCustomers/CustomerID|qryCustomers/Company|qryCustomers/Contact"/>
<!– Rename <qryInvoices> element to <Invoice> –>
<xsl:template match="qryInvoices">
<Invoice>
<xsl:apply-templates select="@*|node()"/>
</Invoice>
</xsl:template>
<!– Rename <qryInvoiceLines> element to <InvoiceLines> –>
<xsl:template match="qryInvoiceLines">
<InvoiceLines>
<xsl:apply-templates select="@*|node()"/>
</InvoiceLines>
</xsl:template>
<!– Remove the element qryInvoiceLines/OrderID, it was just a
requirement for Access to create a child node.
This matches it but does nothing with it, ie it deletes it –>
<xsl:template match="qryInvoiceLines/OrderID"/>
<!– Rename <qryInvoiceLine> element to <InvoiceLine> –>
<xsl:template match="qryInvoiceLine">
<InvoiceLine>
<xsl:apply-templates select="@*|node()"/>
</InvoiceLine>
</xsl:template>
<!– Remove the element qryInvoiceLine/OrderID, it was just a
requirement for Access to create a child node.
This matches it but does nothing with it, ie it deletes it –>
<xsl:template match="qryInvoiceLine/OrderID"/>
</xsl:stylesheet>
[/sourcecode]
and now the final transformed XML
[sourcecode gutter=”True” language=”xml”]
<?xml version="1.0" encoding="UTF-8"?>
<CustomersHistory generated="2017-12-28T05:11:15">
<Customer CustomerID="27" Company="Company AA" Contact="Karen Toh">
<Invoice>
<InvoiceID>15</InvoiceID>
<InvoiceDate>24 March 2017</InvoiceDate>
<OrderID>30</OrderID>
<CustomerID>27</CustomerID>
<InvoiceLines>
<InvoiceLine>
<ProductName>Northwind Traders Almonds</ProductName>
<Quantity>10</Quantity>
<UnitPrice>30</UnitPrice>
</InvoiceLine>
<InvoiceLine>
<ProductName>Northwind Traders Cake Mix</ProductName>
<Quantity>5</Quantity>
<UnitPrice>3</UnitPrice>
</InvoiceLine>
<InvoiceLine>
<ProductName>Northwind Traders Cajun Seasoning</ProductName>
<Quantity>9</Quantity>
<UnitPrice>5</UnitPrice>
</InvoiceLine>
</InvoiceLines>
</Invoice>
</Customer>
</CustomersHistory>
[/sourcecode]
Comments closed.