comparison diff-colorize.py @ 26:3b33b1c48880

Changed the token expression to add support for sub-identifier word differencing.

Example (this change causes only “Disable” and “Enable” to be highlighted):

- CLVSuddenTerminationDisable();
+ CLVSuddenTerminationEnable();
author Peter Hosey <hg@boredzo.org>
date Sat, 08 Jan 2011 01:21:51 -0800
parents 94e9ee861fc3
children 5f17911c4fe6
--- diff-colorize.py 25:94e9ee861fc3
+++ diff-colorize.py 26:3b33b1c48880
@@ -172,12 +172,12 @@
 def common_and_distinct_substrings(a, b):
     "Takes two strings, a and b, tokenizes them, and returns a linked list whose nodes contain runs of either common or unique tokens."
     def tokenize(a):
         "Each token is an identifier, a number, or a single character."
         import re
-        # Identifier, binary number, hex number, decimal number, operator, other punctuation.
-        token_exp = re.compile('[_a-zA-Z][_a-zA-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+|[-+*|&^/%\[\]<=>,]|[()\\\\;`{}]')
+        # Word in identifier, word in macro name (MACRO_NAME), binary number, hex number, decimal number, operator, other punctuation.
+        token_exp = re.compile('[_A-Z]*[_a-z0-9]+:?|_??[A-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+|[-+*|&^/%\[\]<=>,]|[()\\\\;`{}]')
         start = 0
         for match in token_exp.finditer(a):
             for ch in a[start:match.start()]:
                 yield ch
             yield match.group(0)
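
To see what the new token expression changes, the example from the commit message can be run through both patterns. This is a minimal standalone sketch, not part of diff-colorize.py: the names `OLD_TOKEN_EXP`, `NEW_TOKEN_EXP`, and this free-standing `tokenize` are illustrative, and the last three lines of the loop (advancing `start` and yielding the remainder) are an assumption, since the comparison stops before that part of the function.

```python
import re

# Both patterns are copied from line 178 of the comparison (old and new),
# written as raw strings so the regex engine sees the same backslashes.
OLD_TOKEN_EXP = re.compile(r'[_a-zA-Z][_a-zA-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+|[-+*|&^/%\[\]<=>,]|[()\\;`{}]')
NEW_TOKEN_EXP = re.compile(r'[_A-Z]*[_a-z0-9]+:?|_??[A-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+|[-+*|&^/%\[\]<=>,]|[()\\;`{}]')

def tokenize(text, token_exp):
    "Yield regex matches as tokens and any unmatched characters one at a time."
    start = 0
    for match in token_exp.finditer(text):
        # Characters skipped between matches become single-character tokens.
        for ch in text[start:match.start()]:
            yield ch
        yield match.group(0)
        # Assumed continuation: the comparison does not show these lines.
        start = match.end()
    for ch in text[start:]:
        yield ch

removed = 'CLVSuddenTerminationDisable();'
added = 'CLVSuddenTerminationEnable();'

old_tokens = list(tokenize(removed, OLD_TOKEN_EXP))
new_removed = list(tokenize(removed, NEW_TOKEN_EXP))
new_added = list(tokenize(added, NEW_TOKEN_EXP))

print(old_tokens)   # ['CLVSuddenTerminationDisable', '(', ')', ';']
print(new_removed)  # ['CLVSudden', 'Termination', 'Disable', '(', ')', ';']
print(new_added)    # ['CLVSudden', 'Termination', 'Enable', '(', ')', ';']
```

With the old expression the whole identifier is a single token, so the two lines differ in one long token. With the new expression the identifier splits at each capitalized word, and only the `Disable` and `Enable` tokens differ.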