Watch out when filtering resources with Maven

Boudewijn van Klingeren

At my current project we are using Maven 1.1 for creating build artifacts. Although a little outdated these days, it still works well. Like Maven 2, this version can filter resources (e.g. XML or properties files), substituting certain keywords with values specified in properties files and/or on the command line. We use it for a number of reasons and it works perfectly… if properly used, of course.

After a normal update from SVN the other day, I tried to build the project. Everything went well until one of the unit tests unexpectedly failed with the message “Invalid byte 3 of 3-byte UTF-8 sequence” when reading an XSD file for validation. At the same time a few colleagues of mine were investigating a weird Bamboo build error involving a font file we use in a GUI component. Not being aware that the two problems could have the same cause, I began to investigate my UTF error. After some googling I found that it could have something to do with the encoding of the XSD file. We are using the system’s default XML parser, which is Apache Xerces, and it reads an XSD file with the encoding specified in the XML header. The encoding was specified as UTF-8 and the original source file was indeed a UTF-8 file. So I checked the encoding of the copy in the Maven target directory, and it was ANSI. Hmm, weird…

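To keep this post short: it turned out that the resource filtering configuration had been changed recently. The following resources configuration was used for the failing build:

        <resources>
            <resource>
                <directory>src/resources</directory>
                <filtering>true</filtering>
                <includes>
                    <include>**/*.properties</include>
                    <include>**/*.xml</include>
                </includes>
            </resource>
            <resource>
                <directory>src/resources</directory>
                <filtering>true</filtering>
                <excludes>
                    <exclude>**/*.properties</exclude>
                    <exclude>**/*.xml</exclude>
                </excludes>
            </resource>
            <resource>
                <directory>src/java</directory>
                <includes>
                    <include>**/*.properties</include>
                </includes>
            </resource>
        </resources>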

This configuration causes the following actions to be taken. First, all resources with the extensions ‘.properties’ and ‘.xml’ in the directory ‘src/resources’ are copied to the target directory structure and filtered (their contents are checked and, where needed, substituted). Second, all resources in ‘src/resources’ that do not have the extensions ‘.properties’ or ‘.xml’ are also copied and filtered. Third, all ‘.properties’ files in the ‘src/java’ directory are copied without filtering.

The mistake here is in the second resource configuration. The filtering option should not have been present here, because the combination of the first and second resource configuration in fact causes all resources to be filtered, which should not have been the case.
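To make the overlap concrete, here is a minimal Java sketch of how the three resource entries select files. It is not Maven code: the `Rule` class, the file names (`app.properties`, `schema.xsd`, `font.ttf`) and the simplification of patterns like `**/*.xml` to plain suffix checks are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ResourceSelection {

    /** Hypothetical stand-in for one resource entry; not a Maven class. */
    static final class Rule {
        final String directory;
        final boolean filtering;
        final List<String> includes; // empty list means "include everything"
        final List<String> excludes;

        Rule(String directory, boolean filtering, List<String> includes, List<String> excludes) {
            this.directory = directory;
            this.filtering = filtering;
            this.includes = includes;
            this.excludes = excludes;
        }

        /** Simplification: patterns such as "**" + "/*.xml" are reduced to suffix checks. */
        boolean selects(String dir, String fileName) {
            if (!directory.equals(dir)) {
                return false;
            }
            boolean included = includes.isEmpty() || endsWithAny(fileName, includes);
            return included && !endsWithAny(fileName, excludes);
        }
    }

    static boolean endsWithAny(String name, List<String> suffixes) {
        for (String suffix : suffixes) {
            if (name.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    /** True if any entry that selects the file has filtering switched on. */
    static boolean isFiltered(List<Rule> rules, String dir, String fileName) {
        for (Rule rule : rules) {
            if (rule.selects(dir, fileName) && rule.filtering) {
                return true;
            }
        }
        return false;
    }

    /** The three entries from the post; the flag toggles filtering on the second one. */
    static List<Rule> config(boolean filterSecondEntry) {
        List<String> textFiles = Arrays.asList(".properties", ".xml");
        List<String> none = Collections.emptyList();
        return Arrays.asList(
                new Rule("src/resources", true, textFiles, none),
                new Rule("src/resources", filterSecondEntry, none, textFiles),
                new Rule("src/java", false, Arrays.asList(".properties"), none));
    }

    public static void main(String[] args) {
        for (String file : Arrays.asList("app.properties", "schema.xsd", "font.ttf")) {
            System.out.println(file
                    + " filtered with second entry filtering on: "
                    + isFiltered(config(true), "src/resources", file)
                    + ", off: "
                    + isFiltered(config(false), "src/resources", file));
        }
    }
}
```

With filtering switched on for the second entry, every file in ‘src/resources’ ends up filtered, including the XSD and the font. With it switched off, only the ‘.properties’ and ‘.xml’ files are.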

After removing the filtering option from the second resource configuration, the build ran perfectly again. The fun thing is that after checking in the change, the Bamboo build was fixed as well. My colleagues, who were still trying to resolve the font error, were quite amazed, as was I. For some reason Maven had messed up the files while filtering and copying. Quite bizarre when you consider that the XSD file that caused my error contained no keyword to substitute, and that the font file that caused the Bamboo error was binary…
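I have not verified what Maven does internally, but one plausible explanation is that a filtering copy reads each file as text in the platform's default encoding and writes it back out again. The Java sketch below shows what such a round trip does to bytes that are not valid in that encoding. The class and method names are made up for illustration, and `windows-1252` is only an assumption about what "ANSI" meant on the build machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FilterRoundTrip {

    /** Mimics a text-based copy: decode the bytes as text, then encode the text again. */
    static byte[] copyAsText(byte[] input, Charset charset) {
        String text = new String(input, charset);
        return text.getBytes(charset);
    }

    /** Formats bytes as space-separated upper-case hex, e.g. "E2 80 9D". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Assumption: the "ANSI" default encoding was windows-1252.
        Charset ansi = Charset.forName("windows-1252");

        // A right double quotation mark (U+201D) is E2 80 9D in UTF-8.
        byte[] utf8Quote = "\u201D".getBytes(StandardCharsets.UTF_8);
        byte[] copied = copyAsText(utf8Quote, ansi);

        // Byte 9D has no character in windows-1252, so it comes back as '?' (3F).
        System.out.println(hex(utf8Quote) + " -> " + hex(copied));
    }
}
```

This prints `E2 80 9D -> E2 80 3F`: the third byte of a 3-byte UTF-8 sequence has been replaced, which is the kind of damage the parser's error message describes. A binary file such as a font would be hit the same way wherever it contains a byte the default encoding cannot represent, even though there is no keyword in it to substitute.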

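For completeness, here is the configuration that worked (with the necessary documentation):

        <resources>
            <!-- Copy AND filter only the .properties and .xml files -->
            <resource>
                <directory>src/resources</directory>
                <filtering>true</filtering>
                <includes>
                    <include>**/*.properties</include>
                    <include>**/*.xml</include>
                </includes>
            </resource>
            <!-- Copy all other resources WITHOUT filtering -->
            <resource>
                <directory>src/resources</directory>
                <excludes>
                    <exclude>**/*.properties</exclude>
                    <exclude>**/*.xml</exclude>
                </excludes>
            </resource>
            <resource>
                <directory>src/java</directory>
                <includes>
                    <include>**/*.properties</include>
                </includes>
            </resource>
        </resources>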

Comments (3)

  1. bn_

    November 2, 2008 at 11:29 pm

    Is it possible that someone edited the XSD in an editor and saved it as UTF-8 with a byte order mark (BOM)? You can easily test this: look at the first 3 bytes of the XSD file. If they resemble a BOM, Maven and filtering are not part of the problem; the original text editor was.

  2. Boudewijn van Klingeren

    November 3, 2008 at 10:52 am

    It is highly unlikely that that was the case, because after the change in the resource filtering the problem was gone. Also, it doesn't explain why an XSD with a UTF BOM is copied in ANSI encoding instead of UTF with a BOM, or does it?

  3. syllepsa

    May 23, 2009 at 2:03 am

    Thanks for the example. Finally I got it. It might be easier to understand if we explicitly set filtering for the second resource to false. The reason is that you disable filtering for all the resources except the ones in the excludes tag.
