comparison hgsubversion/svnwrap/subvertpy_wrapper.py @ 931:e1dbd9646d6a

svnwrap: use custom StringIO class in get_file()

The wrappers were calling ra.get_file() with a cStringIO object. Empirically, svn 1.7.5 writes 16kB blocks to the stream object, and cStringIO reallocates its internal buffer and doubles its size whenever it fills up. With large committed files, this requires two large memory blocks at the same time.

SimpleStringIO implements the minimum StringIO interface used by ra.get_file() but stores all the blocks and joins them at the end. This means more fragmentation but requires only one large block, without overallocation. Also, 16kB blocks should be friendly to most allocators.

In practice, this simple change let me convert a revision containing multiple moderately large files, the largest being around 450MB, on a 32-bit Windows setup with Python 2.7 and swig svn 1.7.5, in stupid mode, where it previously aborted with "not enough memory". The same revision still fails in replay mode.
author Patrick Mezard <patrick@mezard.eu>
date Sun, 16 Sep 2012 19:31:49 +0200
parents 772280aed751
children 1de83496df4e
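
The real common.SimpleStringIO is not part of this comparison, so the following is only a minimal sketch of the idea the commit message describes: keep every written block in a list and join once at the end. The names ChunkListIO and fake_get_file are hypothetical stand-ins (for common.SimpleStringIO and ra.get_file() respectively), and the only methods provided are the three that the changed code calls on the stream: write(), getvalue() and close().

```python
class ChunkListIO(object):
    """Hypothetical stand-in for common.SimpleStringIO (not the real code)."""

    def __init__(self):
        self._blocks = []

    def write(self, block):
        # Keep each block as received; no large buffer is grown or copied.
        self._blocks.append(block)

    def getvalue(self):
        # A single join builds the result in one allocation of the exact size.
        return b"".join(self._blocks)

    def close(self):
        # Drop the references so the small blocks can be freed.
        self._blocks = []


def fake_get_file(stream, payload, block_size=16 * 1024):
    """Hypothetical stand-in for ra.get_file(): writes payload in 16kB blocks."""
    for start in range(0, len(payload), block_size):
        stream.write(payload[start:start + block_size])


# Same call sequence as the changed lines in get_file().
payload = b"x" * (40 * 1024 + 5)
out = ChunkListIO()
fake_get_file(out, payload)
data = out.getvalue()
out.close()
```

The trade-off is the one stated in the commit message: many small blocks instead of one buffer that is repeatedly reallocated at double the size, at the cost of more fragmentation.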
--- hgsubversion/svnwrap/subvertpy_wrapper.py	930:5bacb9c63e3e
+++ hgsubversion/svnwrap/subvertpy_wrapper.py	931:e1dbd9646d6a
@@ -470,11 +470,11 @@
         otherwise. If the file does not exist at this revision, raise
         IOError.
         """
         mode = ''
         try:
-            out = cStringIO.StringIO()
+            out = common.SimpleStringIO()
             rev, info = self.remote.get_file(path, out, revision)
             data = out.getvalue()
             out.close()
             if isinstance(info, list):
                 info = info[-1]