[Svnmerge] correct parsing of xml output
Mattias Engdegård
mattias at virtutech.se
Mon Jan 23 06:16:43 PST 2006
get_copyfrom() does not parse the output of svn log --xml correctly.
Regexes are greedy, so the .* in the pattern can run across several <path>
elements and pick up the copyfrom attributes of the wrong one.
We were just bitten by this bug.
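A minimal sketch of the mismatch, for illustration only. The XML fragment and the copyfrom() helper are made up for this example (they are not taken from svnmerge.py or from real log output); the two patterns are the old and new ones from the patch.

```python
import re

# Hypothetical, abbreviated log output, already flattened to one line.
out = ('<paths>'
       '<path copyfrom-path="/trunk" copyfrom-rev="10" action="A">/branches/old</path>'
       '<path copyfrom-path="/branches/old" copyfrom-rev="20" action="A">/branches/new</path>'
       '</paths>')
rlpath = "/branches/new"

def copyfrom(text):
    # Same two lookups as in get_copyfrom(): the first occurrence wins.
    head = re.search(r'copyfrom-path="([^"]*)"', text).group(1)
    rev = re.search(r'copyfrom-rev="([^"]*)"', text).group(1)
    return head, rev

# Old pattern: the match starts at the first <path and .* runs on to the
# element we want, so the captured text contains both elements.
greedy = re.search(r'(<path .*action="A".*>%s</path>)' % rlpath, out)

# New pattern: [^>]* cannot leave the opening tag, so only the tag that
# directly precedes rlpath is captured.
strict = re.search(r'<path ([^>]*action="A"[^>]*)>%s</path>' % rlpath, out)

print(copyfrom(greedy.group(1)))   # attributes of /branches/old (wrong)
print(copyfrom(strict.group(1)))   # attributes of /branches/new
```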
--- svnmerge.py (revision 18197)
+++ svnmerge.py (working copy)
@@ -479,7 +479,7 @@
     out = launchsvn('log -v --xml --stop-on-copy "%s"' % dir, split_lines=False)
     out = out.replace("\n", " ")
     try:
-        m = re.search(r'(<path .*action="A".*>%s</path>)' % rlpath, out)
+        m = re.search(r'<path ([^>]*action="A"[^>]*)>%s</path>' % rlpath, out)
         head = re.search(r'copyfrom-path="([^"]*)"', m.group(1)).group(1)
         rev = re.search(r'copyfrom-rev="([^"]*)"', m.group(1)).group(1)
         return head,rev
This improves it to the point of actually working in practice, but since >
is legal inside attribute values it is not completely correct.
Doing this correctly would make the regexp more complicated; the two
instances of [^>]* would have to be replaced by something like
([^>'"]*("[^"]*"|'[^']*'))*
where ' needs to be escaped somehow (I suggest using a triple-quoted
string instead of a "raw" string). Add this if you care enough.
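A rough sketch of what the quote-aware variant could look like. It is not a literal substitution of the expression above: quoted values are one alternative inside a single repeated group, so whitespace after a quoted value is still consumed. The names ATTRS and find_added_path_attrs, the \b guard, and the use of re.escape() on the path are assumptions added for this example, not part of the patch.

```python
import re

# One piece of an attribute list: either a character that is neither '>'
# nor a quote, or a complete quoted value (which may itself contain '>').
# A triple-quoted raw string lets both quote characters appear unescaped.
ATTRS = r'''(?:[^>'"]|"[^"]*"|'[^']*')*'''

def find_added_path_attrs(out, rlpath):
    """Return the attribute text of the <path ... action="A"> element
    whose content is rlpath, or None if there is no such element."""
    pattern = r'<path\s(%s\baction="A"%s)>%s</path>' % (
        ATTRS, ATTRS, re.escape(rlpath))
    m = re.search(pattern, out)
    if m is None:
        return None
    return m.group(1)

# Hypothetical input with a literal '>' inside an attribute value.
tricky = ('<path copyfrom-path="/trunk>odd" copyfrom-rev="10" '
          'action="A">/branches/new</path>')
attrs = find_added_path_attrs(tricky, "/branches/new")
simple = re.search(r'<path ([^>]*action="A"[^>]*)>%s</path>' % "/branches/new",
                   tricky)
```

On this input the simpler [^>]* pattern finds nothing, because it stops at the '>' inside the quoted value, while the quote-aware one captures the whole attribute list.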
Proper XML parsing would be best, but I suppose you have some reason for
not doing this (compatibility with old Python versions, perhaps).
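For comparison, a sketch of the parser-based approach using xml.etree.ElementTree. That module only joined the standard library in Python 2.5, so it would not run on older interpreters. The function name get_copyfrom_xml, the (None, None) fallback, and the sample document are assumptions made for this example; the element and attribute names are the ones the regexes above already rely on.

```python
import xml.etree.ElementTree as ET

def get_copyfrom_xml(log_xml, rlpath):
    """Return (copyfrom_path, copyfrom_rev) for the first <path> element
    with action="A" whose text is rlpath, or (None, None) if none exists."""
    root = ET.fromstring(log_xml)
    for path in root.iter("path"):
        text = (path.text or "").strip()
        if path.get("action") == "A" and text == rlpath:
            # Attribute values come back already unescaped, so a '>' or a
            # quote inside a value needs no special handling here.
            return path.get("copyfrom-path"), path.get("copyfrom-rev")
    return None, None

# Hypothetical, abbreviated log document.
sample = (
    '<log><logentry revision="21"><paths>'
    '<path copyfrom-path="/trunk" copyfrom-rev="10" action="A">/branches/old</path>'
    '<path copyfrom-path="/branches/old" copyfrom-rev="20" action="A">/branches/new</path>'
    '</paths></logentry></log>'
)
result = get_copyfrom_xml(sample, "/branches/new")
```

With a real parser there is no need to flatten newlines first, and the order of attributes within the tag does not matter.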
Please apply this patch or the suggested variant.