[Svnmerge] svnmerge.py init crashes (on bad characters in svn xml log)

Jean-Philippe Daigle Jean-Philippe.Daigle at SolaceSystems.com
Tue Oct 6 13:30:36 PDT 2009


I'm having a problem doing an "svnmerge.py init" on a branch to track
the trunk, apparently due to bad characters in a commit message. Knowing
what the problem is doesn't really help me find a solution that doesn't
involve the words "time machine", though.

Here's what I'm seeing:
$ svnmerge.py init svn://<MYPROJECT>/trunk
Traceback (most recent call last):
  File "/cygdrive/c/Documents and Settings/jdaigle/My Documents/bin/svnmerge.py", line 2366, in <module>
    main(sys.argv[1:])
  File "/cygdrive/c/Documents and Settings/jdaigle/My Documents/bin/svnmerge.py", line 2361, in main
    cmd(branch_dir, branch_props)
  File "/cygdrive/c/Documents and Settings/jdaigle/My Documents/bin/svnmerge.py", line 1813, in __call__
    return self.func(*args, **kwargs)
  File "/cygdrive/c/Documents and Settings/jdaigle/My Documents/bin/svnmerge.py", line 1349, in action_init
    cf_source, cf_rev, copy_committed_in_rev = get_copyfrom(source_url)
  File "/cygdrive/c/Documents and Settings/jdaigle/My Documents/bin/svnmerge.py", line 1077, in get_copyfrom
    % target, split_lines=False)):
  File "/cygdrive/c/Documents and Settings/jdaigle/My Documents/bin/svnmerge.py", line 1034, in __getitem__
    for event, node in self._events:
  File "/usr/lib/python2.5/xml/dom/pulldom.py", line 232, in next
    rc = self.getEvent()
  File "/usr/lib/python2.5/xml/dom/pulldom.py", line 265, in getEvent
    self.parser.feed(buf)
  File "/usr/lib/python2.5/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python2.5/xml/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:356244:39: not well-formed (invalid token)

Right, so it's crashing on line 356244 of an XML file. Using
Sysinternals procmon, I traced svn calls to get the exact call that
generated it. Svnmerge is calling 'svn --non-interactive log -v --xml
--stop-on-copy "svn://<MYPROJECT>/trunk"'. I ran that command manually
and dumped the log to a file, where line 356244 looks like this (the
mailing list may or may not preserve the quotes in the next line, but
they're critical to the problem):

<msg>For UDP publishers, if they check "Send invalid UDP messages over
TCP", then [...]
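
For anyone who wants to locate such bytes without reading the dump by eye, here is a minimal sketch in Python 3 (an assumption on my part; the traceback above is from Python 2.5). It scans a saved copy of the log output and reports every byte run that does not decode as UTF-8. The function name find_invalid_utf8 and the file name svnlog.xml are hypothetical and are not part of svnmerge.py.

```python
def find_invalid_utf8(data):
    """Return (line_number, byte_column, offending_bytes) for each byte run
    in `data` that is not valid UTF-8.

    Line numbers are 1-based; columns are 0-based byte offsets in the line.
    """
    problems = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        pos = 0
        while pos < len(line):
            try:
                # Decode the remainder of the line; success means no more problems.
                line[pos:].decode("utf-8")
                break
            except UnicodeDecodeError as exc:
                start = pos + exc.start
                end = pos + exc.end
                problems.append((line_no, start, line[start:end]))
                # Resume scanning just past the undecodable bytes.
                pos = end
    return problems


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python find_bad_bytes.py svnlog.xml
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            for line_no, col, raw in find_invalid_utf8(handle.read()):
                print("line %d, column %d: %r" % (line_no, col, raw))
```

The file is opened in binary mode on purpose: opening it as text would either raise on the first bad byte or silently replace it, hiding exactly what the scan is looking for.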

The hex values for these quotes are 0x93 and 0x94, the infamous "Microsoft
Smart Quotes" (don't know their real name), and they SOMEHOW made their
way into a commit log way back in 2006. Oops. I'm guessing it's illegal
for an XML document to contain these, and if that's so, then "svn log -v
--xml" is generating invalid XML and should be attempting to escape
these.
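
As a rough illustration of the byte-level problem, and not something svnmerge.py does, the sketch below (Python 3, my assumption) treats the log output as UTF-8 and falls back to Windows-1252 only for the bytes that fail to decode. In that code page 0x93 and 0x94 are the left and right double quotation marks, U+201C and U+201D. The handler name cp1252_fallback and the function repair_svn_log_xml are hypothetical.

```python
import codecs
import xml.dom.minidom


def _cp1252_fallback(exc):
    """Decode-error handler: reinterpret undecodable bytes as Windows-1252."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Bytes undefined in cp1252 (for example 0x81) become U+FFFD.
    return bad.decode("cp1252", errors="replace"), exc.end


codecs.register_error("cp1252_fallback", _cp1252_fallback)


def repair_svn_log_xml(raw):
    """Return `raw` re-encoded as clean UTF-8.

    Assumes the input is meant to be UTF-8 and that stray bytes came from
    a Windows-1252 client. This is a guess about the data, not a fact.
    """
    text = raw.decode("utf-8", errors="cp1252_fallback")
    return text.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical stand-in for one entry of the real log output.
    broken = (b'<?xml version="1.0"?>\n<log><logentry revision="1">'
              b"<msg>check \x93Send invalid UDP messages over TCP\x94</msg>"
              b"</logentry></log>")
    doc = xml.dom.minidom.parseString(repair_svn_log_xml(broken))
    msg = doc.getElementsByTagName("msg")[0].firstChild.data
    print(ascii(msg))
```

This only covers invalid byte sequences. Control characters that are valid UTF-8 but not allowed in XML 1.0 would still make the parser fail and need separate handling.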

I don't believe there's any way to retroactively rewrite that commit
message, so does anyone have a suggestion?


