Title | P4DTI test suite with PyXML 0.8.3 fails without XHTML DTD file |
Status | closed |
Priority | essential |
Assigned user | Nick Barnes |
Organization | Ravenbrook |
Description | The P4DTI test suite uses Python's XML libraries and PyXML extensions to check the XHTML documentation which forms part of the P4DTI product sources (e.g. manuals and design documents). When used with PyXML 0.8.3 (unlike PyXML 0.7.x), this part of the test suite fails because it can't find the XHTML DTD named in the doctype declaration. The failure is reported in various ways, but the underlying error is this: ValueError: unknown url type: /tmp/DTD/xhtml1-transitional.dtd |
Analysis | A full backtrace of the failure looks like this: File "check_xhtml.py", line 1054, in check xml.sax.parse(path_or_stream, self, self) File "/usr/local/lib/python2.2/site-packages/_xmlplus/sax/__init__.py", line 31, in parse parser.parse(filename_or_stream) File "/usr/local/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 109, in parse xmlreader.IncrementalParser.parse(self, source) File "/usr/local/lib/python2.2/site-packages/_xmlplus/sax/xmlreader.py", line 123, in parse self.feed(buffer) File "/usr/local/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 216, in feed self._parser.Parse(data, isFinal) File "/usr/local/lib/python2.2/site-packages/_xmlplus/sax/expatreader.py", line 395, in external_entity_ref self._source.getSystemId() or File "/usr/local/lib/python2.2/site-packages/_xmlplus/sax/saxutils.py", line 515, in prepare_input_source f = urllib2.urlopen(source.getSystemId()) File "/usr/local/lib/python2.2/urllib2.py", line 138, in urlopen return _opener.open(url, data) File "/usr/local/lib/python2.2/urllib2.py", line 320, in open type_ = req.get_type() File "/usr/local/lib/python2.2/urllib2.py", line 224, in get_type raise ValueError, "unknown url type: %s" % self.__original ValueError: unknown url type: /tmp/DTD/xhtml1-transitional.dtd The problem is that the default XML parser (expatreader) provided by the PyXML library tries to read the external DTD entity, and urllib2 rejects the resolved system identifier because it is a bare path with no URL scheme. In principle this is controllable by calling _parser.SetParamEntityParsing() or by a parser feature switch (xml.sax.make_parser().setFeature(xml.sax.handler.feature_external_ges, 0)). Note that the XML parser doesn't care about the content of the DTD. The actual parsing is driven by the XML SAX handler which we provide. Providing an empty file in the location DTD/xhtml1-transitional.dtd fixes this problem, but it is better to turn off this parsing feature. |
How found | automated_test |
Evidence | Run the test! |
Observed in | 2.0.0 |
Introduced in | 1.1.4 |
Created by | Nick Barnes |
Created on | 2003-12-02 15:01:06 |
Last modified by | Nick Barnes |
Last modified on | 2003-12-02 15:05:12 |
History | 2003-12-02 NB Created. |
Change | Effect | Date | User | Description |
---|---|---|---|---|
66825 | closed | 2003-12-02 15:04:55 | Nick Barnes | Explicitly make an XML parser when we parse, so that we can turn off external entity reading. |
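
The approach named in change 66825 (make the parser explicitly, then switch off external entity reading) might look roughly like the sketch below. This is not the code from that change: it is written against the `xml.sax` module in a current Python 3 standard library rather than PyXML on Python 2.2, and the names `ElementCounter`, `parse_without_external_entities` and `DOC` are hypothetical, introduced only for illustration.

```python
import io
import xml.sax
import xml.sax.handler


class ElementCounter(xml.sax.handler.ContentHandler):
    """Hypothetical stand-in for the checker's SAX handler: counts start tags."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def parse_without_external_entities(source, handler):
    """Parse `source` with an explicitly created parser that never fetches
    external general entities, so a missing DTD file cannot cause a failure."""
    parser = xml.sax.make_parser()
    # The feature switch mentioned in the analysis above.
    parser.setFeature(xml.sax.handler.feature_external_ges, False)
    parser.setContentHandler(handler)
    parser.parse(source)


# Illustrative document whose doctype names a DTD that does not exist locally.
DOC = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    b'"DTD/xhtml1-transitional.dtd">\n'
    b"<html><head><title>t</title></head><body><p>x</p></body></html>"
)

handler = ElementCounter()
parse_without_external_entities(io.BytesIO(DOC), handler)
```

Calling `xml.sax.parse()` directly, as in the backtrace, gives no opportunity to set parser features, which is presumably why the parser has to be created explicitly. A document that relies on entities defined only in the DTD (such as `&nbsp;`) would need separate handling under this sketch.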