[Svnmerge] Re: svn commit: r19626 - trunk/contrib/client-side

Fri May 12 11:33:11 PDT 2006

Blair Zajac wrote:

> I'm thinking instead of using regex's over the XML, which granted is
> fast, a cleaner approach may be to use an XML parsing library.  I'm
> thinking of xmltramp, which generates a nice tree of dictionaries and
> lists using the XML, which then we could iterate over.
>
> http://www.aaronsw.com/2002/xmltramp/
>
> xmltramp has been around for a while, so it probably has support for
> older versions of Python.

The only issue is that of external dependencies. Do we want to require a
non-standard library to run svnmerge.py? Or are you proposing to still keep
the regexp code as a fallback? BTW, I'm -1 on replacing the current regexp
code with a SAX/DOM-based parser, which would take 200 lines or so.

If we were to use an XML library, I'd rather go with ElementTree, which is
part of Python 2.5 (and thus the de-facto pythonic XML library). It works
under Python 1.5.2+, and it even supports iterator-based parsing
(http://effbot.org/zone/element-iterparse.htm), which would allow parsing
large log files without reading them fully into memory. A quick example
using the plain tree API (this one does read the whole log):

>>> from elementtree import ElementTree as ET
>>> import os
>>> f = os.popen("svn log -r7317 http://svn.collab.net/repos/svn --xml -v")
>>> for n in ET.parse(f).getroot():
...     print n.findtext("author")
...     print n.findtext("date")
...     print n.findtext("msg")
...
clkao
2003-10-06T14:53:45.803931Z
Avoid double-closing of SVN::Stream objects.

* perl/Core.pm: undef svn_stream after close.
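
For the iterator-based variant, a minimal sketch might look like the
following. It assumes the xml.etree.ElementTree import path from the Python
2.5+ standard library rather than the standalone elementtree package, and
iter_log_entries and SAMPLE are names made up for illustration; the sample
stands in for real "svn log --xml" output.

```python
import io
import xml.etree.ElementTree as ET  # assumption: stdlib path (Python 2.5+)


def iter_log_entries(stream):
    """Yield (revision, author, date, msg) for each <logentry> in the stream.

    Hypothetical helper: entries are handed out one at a time as the
    parser reaches each closing </logentry> tag.
    """
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "logentry":
            yield (
                elem.get("revision"),
                elem.findtext("author"),
                elem.findtext("date"),
                elem.findtext("msg"),
            )
            # Drop the entry's children and text once it has been consumed,
            # so memory use does not grow with the size of the log.
            elem.clear()


# Stand-in for the output of "svn log --xml"; not fetched from a repository.
SAMPLE = b"""<?xml version="1.0"?>
<log>
<logentry revision="7317">
<author>clkao</author>
<date>2003-10-06T14:53:45.803931Z</date>
<msg>Avoid double-closing of SVN::Stream objects.</msg>
</logentry>
</log>
"""

for revision, author, date, msg in iter_log_entries(io.BytesIO(SAMPLE)):
    print(revision, author, date, msg)
```

In svnmerge.py the stream would presumably be the pipe returned by
os.popen rather than an in-memory buffer; the loop itself would not change.
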
-- 
Giovanni Bajo