
You can use XML Schema to validate an XML document against a predefined set of rules. It is often referred to as XML Schema Definition (XSD). Because the document produced by parsing varies depending on the XML parser, it is recommended that you use a parser-specific XML Schema for validation.
There are two approaches to validating XML documents against an XSD.
The first one uses a Validator, which has been available since JRE/JDK 1.5. This is a processor that checks the XML document against an XML Schema after the initial parsing of the document.
The second one uses the setSchema(Schema schema) method. As a result, the validation happens during the parsing of the document.
Securing Schema Validation when Using a Validator
When you use a Validator, the XML Schema validation happens on the resulting document (created after the parsing), and thus it can be used independently of the XML parser.
Document document = dbFactory.newDocumentBuilder().parse(xmlFile);
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(xsdFile);
Validator validator = schema.newValidator();
validator.validate(new DOMSource(document));
Validating after parsing has a performance impact that depends on the size of the document and the XSD.
The document created by parsing can differ depending on the XML parser in use. Therefore, one and the same XML document may produce different results when parsed with different parsers.
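The parse-then-validate flow described above can be sketched as a small self-contained program. This is only an illustration using standard JAXP classes, without the hardening described below; the class name PostParseValidationSketch, the method isValid, and the inline one-element schema are assumptions made for the example and are not part of the original text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
class PostParseValidationSketch {
    // Assumed example schema: a single <note> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:string'/>"
      + "</xs:schema>";

    static boolean isValid(String xml) {
        try {
            // Step 1: parse the document without any schema attached.
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            dbFactory.setNamespaceAware(true); // schema validation works on namespace-aware DOM trees
            Document document = dbFactory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            // Step 2: validate the resulting DOM tree against the schema.
            SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new DOMSource(document));
            return true;
        } catch (SAXException e) {
            // Raised for malformed XML as well as for schema violations.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling PostParseValidationSketch.isValid with a document whose root element is note returns true, while a document with any other root element is rejected, because a Validator without a custom error handler throws on the first validation error.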
You can activate secure XML validation by using HardenedFacade together with the Validator class, as in the following example:
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
schemaFactory = HardenedFacade.secureSchemaFactory(schemaFactory);
Schema schema = schemaFactory.newSchema(new File(xsdFile));
Validator validator = schema.newValidator();
validator.validate(new DOMSource(document));
The returned schemaFactory is an instance of HardenedSchemaFactory, which is already secured. No further features or properties need to be set.
Validation During Parsing
You can also use HardenedFacade to validate your XML document against an XML Schema during parsing.
The setSchema(Schema schema) method of DocumentBuilderFactory sets the schema on a factory that is an instance of HardenedDocumentBuilderFactory only if the schema parameter is an instance of HardenedSchema. This is the only way to ensure secure schema validation.
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory = HardenedFacade.secureDocumentBuilderFactory(dbFactory);
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
schemaFactory = HardenedFacade.secureSchemaFactory(schemaFactory);
Schema schema = schemaFactory.newSchema(xsdFile);
dbFactory.setSchema(schema);
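To show how validation errors surface when the schema is set on the factory, here is a minimal sketch that uses only standard JAXP classes. It deliberately leaves out HardenedFacade, so it illustrates the mechanism and not the secured setup; the class name ParseTimeValidationSketch, the method parsesCleanly, and the inline schema are assumptions made for the example. Note that a DocumentBuilder reports schema violations to its ErrorHandler, so the sketch installs one that rethrows them.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only.
class ParseTimeValidationSketch {
    // Assumed example schema: a single <note> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note' type='xs:string'/>"
      + "</xs:schema>";

    static boolean parsesCleanly(String xml) {
        try {
            SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));

            // Attach the schema to the factory so that validation runs during parsing.
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            dbFactory.setNamespaceAware(true);
            dbFactory.setSchema(schema);

            DocumentBuilder builder = dbFactory.newDocumentBuilder();
            // Turn reported validation errors into exceptions instead of ignoring them.
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
                @Override public void error(SAXParseException e) throws SAXException { throw e; }
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for malformed XML as well as for schema violations.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In contrast to the Validator approach, no separate validation pass is needed here: the document is rejected while it is being parsed.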
The method that takes a SchemaFactory and a String namespace as parameters is used with an external configuration file. The second parameter identifies exactly which configuration file to use.
public static SchemaFactory secureSchemaFactory(final SchemaFactory factory, final String callerNamespace)
throws SAXNotRecognizedException, SAXNotSupportedException
Call the secureSchemaFactory method as rarely as possible, because it performs resource-intensive operations that might affect the performance of your application. Try to reuse the SchemaFactory instance across classes instead.
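One possible way to reuse a single SchemaFactory across classes is a small holder that creates the factory lazily and hands out the same instance afterwards. The following is a sketch under assumptions: the class name SharedSchemaFactory and its get method are hypothetical, and the place where the one-time hardening call would go is only marked with a comment, because the sketch uses standard JAXP alone.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;

// Hypothetical holder class, for illustration only.
final class SharedSchemaFactory {
    private static SchemaFactory instance;

    private SharedSchemaFactory() {
        // No instances; the class only exposes the shared factory.
    }

    static synchronized SchemaFactory get() {
        if (instance == null) {
            // Created once. In a hardened setup, the single call that secures
            // the factory would be made here, so that its cost is paid only once.
            instance = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        }
        return instance;
    }
}
```

Every caller of SharedSchemaFactory.get() receives the same object. Because SchemaFactory is documented as not thread-safe, callers in a multithreaded application still need to make sure that only one thread uses the shared factory at a time, for example by synchronizing around the newSchema call.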