comparison diff-colorize.py @ 26:3b33b1c48880
Changed the token expression to add support for sub-identifier word differencing.
Example (this change causes only “Disable” and “Enable” to be highlighted):
- CLVSuddenTerminationDisable();
+ CLVSuddenTerminationEnable();
author | Peter Hosey <hg@boredzo.org> |
---|---|
date | Sat, 08 Jan 2011 01:21:51 -0800 |
parents | 94e9ee861fc3 |
children | 5f17911c4fe6 |
comparison
25:94e9ee861fc3 | 26:3b33b1c48880 |
---|---|
172 def common_and_distinct_substrings(a, b): | 172 def common_and_distinct_substrings(a, b): |
173 "Takes two strings, a and b, tokenizes them, and returns a linked list whose nodes contain runs of either common or unique tokens." | 173 "Takes two strings, a and b, tokenizes them, and returns a linked list whose nodes contain runs of either common or unique tokens." |
174 def tokenize(a): | 174 def tokenize(a): |
175 "Each token is an identifier, a number, or a single character." | 175 "Each token is an identifier, a number, or a single character." |
176 import re | 176 import re |
177 # Identifier, binary number, hex number, decimal number, operator, other punctuation. | 177 # Word in identifier, word in macro name (MACRO_NAME), binary number, hex number, decimal number, operator, other punctuation. |
178 token_exp = re.compile('[_a-zA-Z][_a-zA-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+|[-+*|&^/%\[\]<=>,]|[()\\\\;`{}]') | 178 token_exp = re.compile('[_A-Z]*[_a-z0-9]+:?|_??[A-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+|[-+*|&^/%\[\]<=>,]|[()\\\\;`{}]') |
179 start = 0 | 179 start = 0 |
180 for match in token_exp.finditer(a): | 180 for match in token_exp.finditer(a): |
181 for ch in a[start:match.start()]: | 181 for ch in a[start:match.start()]: |
182 yield ch | 182 yield ch |
183 yield match.group(0) | 183 yield match.group(0) |
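
The effect of the new token expression can be checked with a small standalone sketch. It is not part of diff-colorize.py: the two patterns are copied from the revisions above (written as raw strings), while `tokenize`'s second parameter, the `start = match.end()` bookkeeping, the trailing-text handling, and the `changed_tokens` helper are assumptions added for illustration, since the comparison stops at line 183. `difflib.SequenceMatcher` stands in for whatever matching the script itself performs.

```python
import re
from difflib import SequenceMatcher

# Token expressions from revisions 25 and 26, rewritten as raw strings.
OLD_TOKEN_EXP = re.compile(
    r'[_a-zA-Z][_a-zA-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+'
    r'|[-+*|&^/%\[\]<=>,]|[()\\;`{}]')
NEW_TOKEN_EXP = re.compile(
    r'[_A-Z]*[_a-z0-9]+:?|_??[A-Z0-9]+:?|0b[01]+|0[xX][0-9A-Fa-f]+|[0-9]+'
    r'|[-+*|&^/%\[\]<=>,]|[()\\;`{}]')


def tokenize(text, token_exp):
    """Yield regex matches as tokens; unmatched characters are yielded singly."""
    start = 0
    for match in token_exp.finditer(text):
        for ch in text[start:match.start()]:
            yield ch
        yield match.group(0)
        start = match.end()  # assumed: not visible in the comparison above
    for ch in text[start:]:  # assumed: emit any trailing unmatched text
        yield ch


def changed_tokens(a, b, token_exp):
    """Hypothetical helper: return (deleted, inserted) token runs between a and b."""
    ta = list(tokenize(a, token_exp))
    tb = list(tokenize(b, token_exp))
    opcodes = SequenceMatcher(None, ta, tb).get_opcodes()
    return [(ta[i1:i2], tb[j1:j2])
            for tag, i1, i2, j1, j2 in opcodes if tag != 'equal']


old_line = "CLVSuddenTerminationDisable();"
new_line = "CLVSuddenTerminationEnable();"

# Revision 25: the whole identifier is a single token, so all of it differs.
print(changed_tokens(old_line, new_line, OLD_TOKEN_EXP))
# Revision 26: the identifier splits into words, so only one word differs.
print(changed_tokens(old_line, new_line, NEW_TOKEN_EXP))
```

Under the new expression the identifier tokenizes as `CLVSudden`, `Termination`, `Disable`, so the only differing run is `Disable` versus `Enable`, matching the example in the commit message. Note that the leading run of capitals stays attached to the first word (`CLVSudden` rather than `CLV` and `Sudden`).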