
Ignoring Changes

Introduction

This document describes the concepts behind ignoring changes. For the resources associated with this sample, see the Bitbucket repo, here.

The purpose of a comparison is to show all of the changes between two files. However, there may be cases where you want to ignore certain changes, for example when they are not relevant to you or when they cause problems in further processing. Since release 5.1 of XML Compare, XSLT filters have been provided to allow you to ignore selected changes. From release 17.0 of XML Compare, built-in ignore-changes functionality allows you to ignore changes without using an additional filter. We encourage the use of the new API settings in most cases, but it is still possible to ignore changes using a filter.

What does "ignore" really mean?

First, we need to ask the question: What is meant by "ignore"?

Consider this simple example of attribute change:

Input A:

XML
<x y='1'/> 

Input B:

XML
<x y='2'/>

In this case ‘Ignore Changes’ could mean any of the following:

  • Prefer the 'B' value: <x y='2'/>

  • Prefer the 'A' value: <x y='1'/>

  • Take the 'B' value if it exists; otherwise omit it from the output: <x y='2'/>

  • Take the 'A' value if it exists; otherwise omit it from the output: <x y='1'/>

  • Remove the change completely from the result: <x/>

Document Comparator API Settings

Using the Java API or the DCP

The following API settings can be used to configure the Document Comparator pipeline to ignore changes:

Setting | Description
--- | ---
ignoreChangesConfig | An in-built setting in the Document Comparator that allows you to ignore specified changes.
locations | A list of location objects, each holding an XPath to the elements or attributes whose changes should be ignored, together with the result rule used to resolve that location in the result.
resultRule | The rule used to resolve ignored locations in the result. It can take the values listed below.

ResultRule Options

ResultRule | Description
--- | ---
ResultRule.BA | Default. Copy the B value if it exists; otherwise copy the A value.
ResultRule.AB | Copy the A value if it exists; otherwise copy the B value.
ResultRule.A | Copy the A value if it exists; otherwise output nothing.
ResultRule.B | Copy the B value if it exists; otherwise output nothing.
ResultRule.DELETE | Do not copy under any circumstances. This removes the change, but the subtree is still processed if it exists.
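As a rough illustration of the rules in the table above, the following Java sketch models how each rule picks a value for a single ignored location. The class ResultRuleSketch, its nested Rule enum and the resolve method are hypothetical names invented for this illustration. They are not part of the XML Compare API and do not show how the product is implemented.

```java
import java.util.Optional;

// Illustrative model only: these names are hypothetical, not XML Compare API classes.
final class ResultRuleSketch {

    /** Stand-ins for the five result rules described in the table. */
    enum Rule { BA, AB, A, B, DELETE }

    /** Returns the value kept in the result, or empty if nothing is output. */
    static Optional<String> resolve(Rule rule, Optional<String> a, Optional<String> b) {
        switch (rule) {
            case BA: return b.isPresent() ? b : a; // prefer B, fall back to A
            case AB: return a.isPresent() ? a : b; // prefer A, fall back to B
            case A:  return a;                     // A or nothing
            case B:  return b;                     // B or nothing
            default: return Optional.empty();      // DELETE: never copy
        }
    }

    public static void main(String[] args) {
        Optional<String> a = Optional.of("1"); // y='1' in input A
        Optional<String> b = Optional.of("2"); // y='2' in input B
        for (Rule rule : Rule.values()) {
            String y = resolve(rule, a, b).map(v -> " y='" + v + "'").orElse("");
            System.out.println(rule + ": <x" + y + "/>");
        }
    }
}
```

With the values from the earlier attribute example (A is 1, B is 2), BA and B keep 2, AB and A keep 1, and DELETE keeps nothing.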

An example of ignoring changes to the @id attribute and the log element is included below.

Example 1.1: Java API to mark parts of the resources/documentA.xml & resources/documentB.xml to be ignored (src/samples/IgnoreChangesWithAPISettingsSample.java in the sample on Bitbucket)
JAVA
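import java.io.File;
import java.util.ArrayList;
import java.util.List;
// Also import DocumentComparator, IgnoreChangesConfig, Location and ResultRule
// from the XML Compare API (see the sample source for the exact packages).

DocumentComparator dc = new DocumentComparator();

// List the locations whose changes should be ignored.
List<Location> locations = new ArrayList<>();
// No rule is given here, so the default result rule is used.
locations.add(new Location("/addressBook/person/name/@id"));
// Remove changes to the log element from the result.
locations.add(new Location("/addressBook/person/log", ResultRule.DELETE));

// Wrap the locations in a config object and pass it to the comparator.
IgnoreChangesConfig ignoreChangesConfig = new IgnoreChangesConfig(locations);
dc.setIgnoreChangesConfig(ignoreChangesConfig);

// Run the comparison and write the result file.
dc.compare(input1, input2, new File(outputFileName));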
Example 1.2: DCP to mark parts of the resources/documentA.xml & resources/documentB.xml to be ignored (src/dcp/ignore-changes.dcp in the sample on Bitbucket)
CODE
<standardConfig>
 <ignoreChangesConfig> 
   <locations> 
    <location ignoreXpath="/addressBook/person/name/@id"/>
    <location ignoreXpath="/addressBook/person/log" resultRule="DELETE"/>  
   </locations> 
 </ignoreChangesConfig>
</standardConfig>
Using namespaces whilst providing XPaths

When providing XPaths using the Java API (example 1.1) or the DCP (example 1.2), it may be useful to use namespace prefixes to refer to elements within the input documents. For more information on how to define the namespaces used in this context, see Using Namespaces Within XPath Expressions.

Example Comparisons

This document discusses how to ignore changes using two sets of input data: one data-centric and one document-centric. Two practical solutions are presented, one for each input data set, with each solution using a different comparator and a different method of customising the comparison:

  • Document Comparator - Uses Java API calls to customise a pre-existing pipeline with a number of extension points. The Document Comparator provides a solution tailored to comparing structured documents.

  • Pipelined Comparator (DXP) - Uses a filter pipeline defined by an XML file called a 'DXP' to customise the comparison.

Document Comparator

Imagine comparing the following two inputs, with the intention of ignoring the change made to the revision attribute of the author, and also the date elements:

Example 2.1: the author information from a DocBook file (document/documentA.xml in the sample on Bitbucket)
XML
<article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0">
  <info>
    <title>Ignore Changes Sample</title>
      <author revision="1.0">
        <personname>Joe Bloggs</personname>
        <address>
          <phone>+44 200 1234 567</phone> 
          <email>joe@blogs.com</email>
        </address>
        <personblurb><info></info><para></para></personblurb>
      </author>
  </info>
  <sect1>
    <title>Ignore Changes</title>
    <para><date>20141229</date>The input document for the ignore changes sample.</para>
  </sect1>
</article>
Example 2.2: an updated version of the author information with changed telephone numbers and updated dates (document/documentB.xml in the sample on Bitbucket)
XML
<article xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0">
  <info>
    <title>Ignore Changes Sample</title>
      <author revision="1.1">
        <personname>Joe Bloggs</personname>
        <address>
          <phone>+44 200 1235 890</phone> 
          <email>joe@blogs.co.uk</email>
        </address>
        <personblurb><info><date>01032008</date></info><para></para></personblurb>
      </author>
  </info>
  <sect1>
    <title>Ignore Changes</title>
    <para><date>20150105</date>The input document for the ignore changes sample.</para>
  </sect1>
</article>
Example 2.3: XML Compare Output from comparing Example 2.1 & 2.2:
XML
<article xmlns="http://docbook.org/ns/docbook"
  xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1" deltaxml:deltaV2="A!=B"
  deltaxml:word-by-word="false" version="5.0" deltaxml:version="2.1"
  deltaxml:content-type="full-context">
  <preserve:xmldecl xmlns:preserve="http://www.deltaxml.com/ns/preserve" deltaxml:ignore-changes="B"
    deltaxml:deltaV2="A=B" xml-version="1.0" encoding="UTF-8"/>
  <info deltaxml:deltaV2="A!=B">
    <title deltaxml:deltaV2="A=B">Ignore Changes Sample</title>
    <author deltaxml:deltaV2="A!=B">
      <deltaxml:attributes deltaxml:deltaV2="A!=B">
        <dxa:revision xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
          deltaxml:deltaV2="A!=B">
          <deltaxml:attributeValue deltaxml:deltaV2="A">1.0</deltaxml:attributeValue>
          <deltaxml:attributeValue deltaxml:deltaV2="B">1.1</deltaxml:attributeValue>
        </dxa:revision>
      </deltaxml:attributes>
      <personname deltaxml:deltaV2="A=B">Joe Bloggs</personname>
      <address deltaxml:deltaV2="A!=B">
      <phone deltaxml:deltaV2="A!=B">
        <deltaxml:textGroup deltaxml:deltaV2="A!=B">
          <deltaxml:text deltaxml:deltaV2="A">+44 200 1234 567</deltaxml:text>
          <deltaxml:text deltaxml:deltaV2="B">+44 200 1235 890</deltaxml:text>
        </deltaxml:textGroup>
        </phone>
        <email deltaxml:deltaV2="A!=B">
          <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">joe@blogs.com</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">joe@blogs.co.uk</deltaxml:text>
          </deltaxml:textGroup>
          </email></address>
      <personblurb deltaxml:deltaV2="A!=B">
        <info deltaxml:deltaV2="A!=B">
          <date deltaxml:deltaV2="B">01032008</date>
        </info>
        <para deltaxml:deltaV2="A=B"/>
      </personblurb>
    </author>
  </info>
  <sect1 deltaxml:deltaV2="A!=B">
    <title deltaxml:deltaV2="A=B">Ignore Changes</title>
    <para deltaxml:deltaV2="A!=B">
      <date deltaxml:deltaV2="A!=B">
        <deltaxml:textGroup deltaxml:deltaV2="A!=B">
          <deltaxml:text deltaxml:deltaV2="A">20141229</deltaxml:text>
          <deltaxml:text deltaxml:deltaV2="B">20150105</deltaxml:text>
        </deltaxml:textGroup>
      </date>The input document for the ignore changes sample.</para>
  </sect1>
</article>

Pipelined Comparator

Imagine comparing the following two inputs, with the intention of ignoring the change made to the lastUpdated attribute of the person element, <person lastUpdated="01012008">:
Example 3.1: a small address book as an XML file (documentA.xml in the sample on Bitbucket)
XML
<addressBook>
  <person lastUpdated="01012008">
    <log/>
    <name>Joe Blogs</name>
    <telephone>01234 567890</telephone>
    <email>joe@blogs.com</email>
  </person>
</addressBook>
Example 3.2: an updated version of the address book (documentB.xml in the sample on Bitbucket)
XML
<addressBook>
  <person lastUpdated="01022008">
    <log>
      <lastLoggedIn>01032008</lastLoggedIn>
    </log>
    <name>Joe Blogs</name>
    <telephone>01235 467890</telephone>
    <email>joe@blogs.co.uk</email>
  </person>
</addressBook>
Example 3.3: XML Compare Output from comparing Example 3.1 & 3.2:
XML
<addressBook xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
             xmlns:dxx="http://www.deltaxml.com/ns/xml-namespaced-attribute"
             xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
             deltaxml:deltaV2="A!=B"
             deltaxml:version="2.0"
             deltaxml:content-type="full-context">
   <person deltaxml:deltaV2="A!=B">
      <deltaxml:attributes deltaxml:deltaV2="A!=B">
         <dxa:lastUpdated deltaxml:deltaV2="A!=B">
            <deltaxml:attributeValue deltaxml:deltaV2="A">01012008</deltaxml:attributeValue>
            <deltaxml:attributeValue deltaxml:deltaV2="B">01022008</deltaxml:attributeValue>
         </dxa:lastUpdated>
      </deltaxml:attributes>
      <log deltaxml:deltaV2="A!=B">
         <lastLoggedIn deltaxml:deltaV2="B">01032008</lastLoggedIn>
      </log>
      <name deltaxml:deltaV2="A=B">Joe Blogs</name>
      <telephone deltaxml:deltaV2="A!=B">
         <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">01234 567890</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">01235 467890</deltaxml:text>
         </deltaxml:textGroup>
      </telephone>
      <email deltaxml:deltaV2="A!=B">
         <deltaxml:textGroup deltaxml:deltaV2="A!=B">
            <deltaxml:text deltaxml:deltaV2="A">joe@blogs.com</deltaxml:text>
            <deltaxml:text deltaxml:deltaV2="B">joe@blogs.co.uk</deltaxml:text>
         </deltaxml:textGroup>
      </email>
   </person>
</addressBook>

Example 3.3 above shows the deltaV2 output of comparing Examples 3.1 and 3.2. The deltaxml:deltaV2 attributes mark the differences between the two files; for example, an element with the attribute deltaxml:deltaV2="A=B" has not changed between the two files. While this may look overly complicated for such a simple change, it makes processing the change considerably easier. A side-effect of representing attribute changes as elements is the addition of the dxa namespace. This is because a non-qualified attribute does not belong to the namespace of the document but to an anonymous namespace, which needs to be represented.

Marking Data to Ignore

Next we need to mark the data to be ignored. This is achieved by placing the deltaxml:ignore-changes attribute on one of the following:

  • to ignore an attribute change: on the child of deltaxml:attributes that represents the attribute you wish to ignore,

  • to ignore a sub-tree change: on the topmost node in the sub-tree that has a deltaxml:deltaV2 attribute,

  • to ignore a text change: on the deltaxml:textGroup.

By placing the deltaxml:ignore-changes='B,A' attribute, you instruct the apply-ignore-changes XSLT to mark the modification as unchanged and to copy the B version. If the data does not exist in the B version (i.e. in the case of a deletion), the A version is used. This behaviour can be controlled by using a different value for the deltaxml:ignore-changes attribute; the legal values are shown below:

deltaxml:ignore-changes Value | Description
--- | ---
"B,A" or "true" | Default. Copy B if it exists, otherwise copy A.
"A,B" | Copy A if it exists, otherwise copy B.
"A" | Copy A if it exists, otherwise don’t output.
"B" | Copy B if it exists, otherwise don’t output.
"" | Don’t copy under any circumstances (but process the subtree if present).
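The descriptions of these attribute values closely mirror those of the ResultRule options used by the Document Comparator API settings. The Java sketch below records that apparent correspondence, which may help when moving from the filter-based approach to the API settings. The mapping is inferred from the matching descriptions in this document, not taken from the product, and the class and method names are hypothetical.

```java
// Illustrative only: a hypothetical helper, not part of the XML Compare API.
final class IgnoreValueMappingSketch {

    /**
     * Maps a deltaxml:ignore-changes attribute value to the name of the
     * ResultRule whose description matches it (an inferred correspondence).
     */
    static String resultRuleNameFor(String attributeValue) {
        switch (attributeValue) {
            case "B,A":
            case "true": return "BA";     // copy B, fall back to A
            case "A,B":  return "AB";     // copy A, fall back to B
            case "A":    return "A";      // copy A or nothing
            case "B":    return "B";      // copy B or nothing
            case "":     return "DELETE"; // never copy
            default:
                throw new IllegalArgumentException(
                    "Not a legal deltaxml:ignore-changes value: " + attributeValue);
        }
    }

    public static void main(String[] args) {
        for (String value : new String[] {"B,A", "true", "A,B", "A", "B", ""}) {
            System.out.println("\"" + value + "\" -> ResultRule." + resultRuleNameFor(value));
        }
    }
}
```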

The ignore-changes attribute can be added using an XSLT stylesheet.

Since release 17.0, the Document Comparator lets you define which elements and attributes to ignore changes for, and set the result rules to apply to these locations, using either the Java API or the DCP (as detailed here). We recommend this method of marking changes to ignore when using the Document Comparator.

Note that if you want to ignore specific changes to comments or processing instructions, you will need to change the lexical preservation settings on the Comparator. See the Preserving Processing Instructions and Comments sample for more information.

Document Comparator

Writing an XSLT stylesheet to add the deltaxml:ignore-changes attribute

An example of ignoring changes to the revision attribute and the date elements is included below.

Example 1.1: an XSLT stylesheet to mark parts of the DocBook document to be ignored (document/mark-ignore-changes.xsl in the sample on Bitbucket)
XML
<xsl:stylesheet version="2.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1"
                xmlns:docbook="http://docbook.org/ns/docbook"
  >
  
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="deltaxml:attributes/dxa:revision">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'true'"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="docbook:personblurb/docbook:info[@deltaxml:deltaV2]">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'true'"></xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="docbook:para/docbook:date[@deltaxml:deltaV2]">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="''"></xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

After the delta has been marked with the changes that should be ignored, using a filter similar to the one above, running apply-ignore-changes.xsl and then propagate-ignore-changes.xsl will process the delta, ignoring the marked data. The filter dx2-extract-version-moded.xsl is imported by apply-ignore-changes.xsl. All of these filters are supplied with versions of XML Compare 5.1 and later.

The examples used in this document are available in the Ignoring Changes repo (suitable for versions 5.1 and above) and in the Ignore Changes with API settings repo (suitable for versions 17.0 and above using Document Comparator) on Bitbucket.

  • The first sample shows how to ignore both element and attribute changes. It provides two examples, one using the Pipelined Comparator and one using the Document Comparator, of how to construct the pipeline of output filters described here.

  • Similarly, the second sample shows how to ignore both element and attribute changes while using the Document Comparator, configuring the pipeline through the Java API and through the DCP. Sample outputs are also included in this sample.

Pipelined Comparator

An example of ignoring changes to the lastUpdated attribute and the lastLoggedIn element, matched by the templates <xsl:template match="deltaxml:attributes/dxa:lastUpdated"> and <xsl:template match="log/lastLoggedIn[@deltaxml:deltaV2]">, is shown below.

Example 2.1: an XSLT stylesheet to mark the elements and attributes to be ignored (mark-ignore-changes.xsl in the sample on Bitbucket)
XML
<xsl:stylesheet version="2.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dxa="http://www.deltaxml.com/ns/non-namespaced-attribute"
                xmlns:deltaxml="http://www.deltaxml.com/ns/well-formed-delta-v1">
  
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="deltaxml:attributes/dxa:lastUpdated">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'B,A'"/>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="log/lastLoggedIn[@deltaxml:deltaV2]">
    <xsl:copy>
      <xsl:attribute name="deltaxml:ignore-changes" select="'B,A'"></xsl:attribute>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

Running the sample code

For the resources associated with the first sample, see the Bitbucket repo here. For the Document Comparator specific API and DCP sample, see the Bitbucket repo here.

For both of these samples, download the sample resources into the XML Compare release directory under the samples directory. The resources should be located two levels below the top-level release directory that contains the jar files, for example DeltaXML-XML-Compare-x_y_z_j/samples/IgnoreChanges.

Full instructions for running the samples are given in the README.md file, which can be found in the Bitbucket repo and the API settings repo.

Ignore processing in further detail

This section provides some rules and further details about how ignore-changes processing works, and in particular how the apply-ignore-changes.xsl filter works.

Every element in the post-comparison XML tree has an 'effective' deltaxml:deltaV2 attribute which (a) specifies which of the inputs it was present in and (b) whether or not the elements were identical, if present in both inputs. The word effective is used because if you are in an unchanged, added or deleted sub-tree the deltaV2 attribute may only be on an ancestor element.

An element may also have an ancestor ignore-changes attribute; the closest ancestor is used when determining whether the element is included in the result.

Like most filters, some data flows through unaffected. In this case, if an element does not have an ancestor ignore-changes attribute it is copied to the result as-is.

When it does have an ancestor ignore-changes attribute, the following table specifies whether that element appears in the result:

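delta \ ignore-changes | '' | A | B | A,B | B,A / true
--- | --- | --- | --- | --- | ---
A | - | yes | - | yes | yes
B | - | - | yes | yes | yes
A=B | - | yes | yes | yes | yes
A!=B | - | yes | yes | yes | yes

Here "yes" means the element appears in the result and "-" means it does not.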

The only difference in behaviour for A,B vs. B,A occurs at the leaves of the XML tree (i.e. for changed text and attributes).  When there are two possible text values in a textGroup or two possible attribute values then the choice between these settings determines which of two values is used in the result.
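The inclusion rules can be sketched as a small predicate. This is a minimal Java sketch, assuming the behaviour follows the value descriptions given earlier ("Copy A if it exists, otherwise don't output", and so on). The class IgnoreTableSketch and the method appearsInResult are hypothetical names, and the code is not the implementation of apply-ignore-changes.xsl.

```java
// Illustrative model only: hypothetical names, not the filter's real implementation.
final class IgnoreTableSketch {

    /**
     * @param delta         effective deltaV2 value: "A", "B", "A=B" or "A!=B"
     * @param ignoreChanges closest ignore-changes value: "", "A", "B", "A,B", "B,A" or "true"
     * @return whether the element appears in the result
     */
    static boolean appearsInResult(String delta, String ignoreChanges) {
        boolean inA = !delta.equals("B"); // "A", "A=B" and "A!=B" all have an A version
        boolean inB = !delta.equals("A"); // "B", "A=B" and "A!=B" all have a B version
        switch (ignoreChanges) {
            case "":     return false;      // never copied
            case "A":    return inA;        // copied only if present in A
            case "B":    return inB;        // copied only if present in B
            case "A,B":
            case "B,A":
            case "true": return inA || inB; // copied from whichever side has it
            default:
                throw new IllegalArgumentException("Unknown value: " + ignoreChanges);
        }
    }

    public static void main(String[] args) {
        String[] deltas = {"A", "B", "A=B", "A!=B"};
        String[] settings = {"", "A", "B", "A,B", "B,A"};
        for (String delta : deltas) {
            StringBuilder row = new StringBuilder(delta + ":");
            for (String setting : settings) {
                row.append(' ').append(appearsInResult(delta, setting) ? "yes" : "-");
            }
            System.out.println(row);
        }
    }
}
```

The choice between A,B and B,A does not affect this predicate; as noted above, it only decides which of two leaf values is used.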

Ignore changes and attributes

There are some issues related to the closest ancestor rule outlined above when considering attributes.  Attributes need to be attached to their parent element.  If the ignore-change settings specify that an element is not included, neither are any of its attributes irrespective of their ignore change settings. Here is an example:

XML
<x deltaxml:deltaV2='A!=B' deltaxml:ignore-changes=''>
  <deltaxml:attributes deltaxml:deltaV2='A!=B'>
    <dxa:y deltaxml:deltaV2='A!=B' deltaxml:ignore-changes='B'>
      <deltaxml:attributeValue deltaxml:deltaV2="A">12</deltaxml:attributeValue>
      <deltaxml:attributeValue deltaxml:deltaV2="B">24</deltaxml:attributeValue>
    </dxa:y>
  </deltaxml:attributes>
</x>

Normally we would expect y='24' to appear in the result if we look solely at the attribute and its local ignore-changes and deltaV2 attributes. However, the ignore-changes setting on the element x means that the attribute has lost its associated parent element and therefore cannot appear in the result.

Ignore changes and element removal

It is possible to use ignore changes at the element level as well as for simple attribute and text data. This is used for merging as discussed below and can also be used to remove elements from the result.  Here are two examples, firstly removing a child element:

XML
<x deltaxml:ignore-changes="true" deltaxml:deltaV2="A!=B">
  <y deltaxml:deltaV2="A">
     <z deltaxml:ignore-changes='B'/>
  </y>
</x>

In the above example the ignore-changes setting prevents the z element from appearing in the result. Note that as well as occurring at the bottom of a hierarchy, this can also occur within a hierarchy. Here is another example:

XML
<chapter deltaxml:deltaV2="A!=B">
  <section deltaxml:deltaV2="A" deltaxml:ignore-changes='B'>
    <pagebreak deltaxml:ignore-changes='A'/>
  </section>
</chapter>

The ignore-changes settings preclude the section appearing in the result, but the same is not true for the pagebreak element, which is effectively promoted in the result of the filter:

XML
<chapter deltaxml:deltaV2="A!=B">
  <pagebreak deltaxml:ignore-changes='A=B'/>
</chapter>
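The promotion behaviour can be modelled with a small tree walk. The Java sketch below is an illustration under stated assumptions, not the filter's implementation: the Node class and the keep, filter and render methods are hypothetical, an element's own ignore-changes attribute is assumed to take precedence over any ancestor's, and a missing deltaV2 attribute is assumed to be inherited from the nearest ancestor. Delta attributes are omitted from the rendered output for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: hypothetical names, not apply-ignore-changes.xsl itself.
final class PromotionSketch {

    /** Minimal stand-in for a delta element. A null field means "attribute absent". */
    static final class Node {
        final String name;
        final String delta;         // deltaxml:deltaV2 value, or null to inherit
        final String ignoreChanges; // deltaxml:ignore-changes value, or null to inherit
        final List<Node> children = new ArrayList<>();

        Node(String name, String delta, String ignoreChanges, Node... kids) {
            this.name = name;
            this.delta = delta;
            this.ignoreChanges = ignoreChanges;
            children.addAll(Arrays.asList(kids));
        }
    }

    /** Whether an element with this effective delta and closest ignore value is output. */
    static boolean keep(String delta, String ignore) {
        if (ignore == null) return true;    // no ignore-changes in scope: copied as-is
        if (ignore.isEmpty()) return false; // '' never copies the element
        if (ignore.equals("A")) return !"B".equals(delta);
        if (ignore.equals("B")) return !"A".equals(delta);
        return true;                        // "A,B", "B,A" or "true"
    }

    /** Returns what replaces n in the result: a filtered copy, or its surviving descendants. */
    static List<Node> filter(Node n, String inheritedDelta, String inheritedIgnore) {
        String delta = n.delta != null ? n.delta : inheritedDelta;
        String ignore = n.ignoreChanges != null ? n.ignoreChanges : inheritedIgnore;
        List<Node> survivors = new ArrayList<>();
        for (Node child : n.children) {
            survivors.addAll(filter(child, delta, ignore));
        }
        if (!keep(delta, ignore)) {
            return survivors;               // element dropped, survivors promoted
        }
        Node copy = new Node(n.name, null, null);
        copy.children.addAll(survivors);
        List<Node> result = new ArrayList<>();
        result.add(copy);
        return result;
    }

    /** Serialises element names only. */
    static String render(Node n) {
        if (n.children.isEmpty()) return "<" + n.name + "/>";
        StringBuilder sb = new StringBuilder("<" + n.name + ">");
        for (Node child : n.children) sb.append(render(child));
        return sb.append("</").append(n.name).append(">").toString();
    }

    public static void main(String[] args) {
        Node chapter = new Node("chapter", "A!=B", null,
            new Node("section", "A", "B",
                new Node("pagebreak", null, "A")));
        for (Node n : filter(chapter, null, null)) {
            System.out.println(render(n));
        }
    }
}
```

In this model the section is dropped because it exists only in A while its setting is B, and the pagebreak survives because its own setting is A, so it ends up as a direct child of chapter.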

How to merge two documents using deltaxml:ignore-changes

This section has been moved to Creating a Merged Document
