view notes/metadata.txt @ 717:ae5968ffe6fe

svnwrap: fix handling of quotable URLs (fixes #197, refs #132)

The way hgsubversion handles URLs that may or may not be quoted is somewhat
fragile. As part of fixing issue 132 in 925ff8c5989c, the path component of
URLs was always quoted. The URL has been attempted encoded since the initial
check-in.

The fix from 925ff8c5989c was incomplete; reverting it allows us to clone a
URL with a '~' in it.[1] Encoding the URL as UTF-8 seldom works as expected,
as the default string encoding is ASCII, causing Python to be unable to decode
any URL containing an 8-bit character.

The core problem here is that we don't know whether the URL specified by the
user is quoted or not. Rather than trying to deal with this ourselves, we pass
the problem on to Subversion. Then, we obtain the URL from the RA instance,
where it is always quoted. (It's worth noting that the editor interface, on
the other hand, always deals with unquoted paths...)

Thus, the following invariants should apply to SubversionRepo attributes:

- svn_url and root will always be quoted.
- subdir will always be unquoted.

Tests are added that verify that it won't affect the conversion whether a URL
is specified in quoted or unquoted form. Furthermore, a test fixture for this
is added *twice*, so that we can thoroughly test both quoted and unquoted
URLs. I'm not adding a test dedicated to tildes in URLs; it doesn't seem
necessary.

[1] Such as <https://svn.kenai.com/svn/winsw~subversion>.
author Dan Villiom Podlaski Christiansen <danchr@gmail.com>
date Mon, 04 Oct 2010 21:00:36 -0500
parents ba801f44d240

Branches
--------
In order to handle branches that are not immediate children of /branches/, the
following information must be stored in the revmap:

revision path

Where path is the actual relative path of the branch in svn. An example, with
the previous format shown alongside for comparison:
New                          | Old
3 <hash> trunk               | 3 <hash>
4 <hash> branches/foo        | 4 <hash> foo
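
As a rough illustration of the two layouts in the table above, the sketch
below parses one revmap line in either format. The function name and the
old_format flag are hypothetical, and the reading of the old format (no third
field means trunk, a bare name means branches/<name>) is inferred from the
example only; it is not taken from the hgsubversion source.

```python
def parse_revmap_line(line, old_format=False):
    """Split a revmap line into (revision, node hash, branch path).

    Assumption: in the old format a missing third field means trunk and a
    bare name means 'branches/<name>'. The new format stores the path as-is.
    """
    fields = line.split(None, 2)
    revision, node = int(fields[0]), fields[1]
    name = fields[2].strip() if len(fields) > 2 else ''
    if old_format:
        # Old lines cannot describe branches outside /branches/.
        path = 'branches/' + name if name else 'trunk'
    else:
        path = name
    return revision, node, path
```

A flag is needed because a bare name such as 'foo' is ambiguous on its own: in
the old format it names a branch, in the new one it would be a literal path.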

Tags
----
Note that if a tag is committed to, we can handle that case by making a branch
and then marking it as deleted. The revmap line would look something like this:
10 <hash> tags/the_tag
And the commit would be done on the hg branch 'modified-tag/the_tag'. If the
tag path were 'tags/releases/1.0.0' instead, the branch would be
'modified-tag/releases/1.0.0'.
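
The naming rule above can be sketched as a small helper. The function name and
the tags_dir parameter are hypothetical; only the 'modified-tag/' prefix and
the two example paths come from these notes.

```python
def modified_tag_branch(svn_path, tags_dir='tags'):
    """Map a tag path such as 'tags/the_tag' to its hg branch name."""
    prefix = tags_dir.strip('/') + '/'
    path = svn_path.strip('/')
    if not path.startswith(prefix):
        raise ValueError('not a path below %r: %r' % (tags_dir, svn_path))
    # Everything below the tags directory is kept, including subdirectories.
    return 'modified-tag/' + path[len(prefix):]
```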

Detecting Closing of Branches
-----------------------------
Subversion users typically remove branches when done with them. This means that
if a commit performs a delete operation on the '' path inside a branch (that
is, on the branch root itself), we can be sure that the branch no longer
exists. The branch should then be marked as inactive.
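
A minimal sketch of that check, assuming the deleted paths of a commit and the
set of known branch paths are available as plain strings. Both function names
are hypothetical, and deleting a parent directory of several branches is
deliberately not handled here.

```python
def split_branch_path(path, branch_paths):
    """Return (branch, path inside the branch), or (None, path) if no match."""
    path = path.strip('/')
    # Try longer branch paths first so nested branch paths win.
    for branch in sorted(branch_paths, key=len, reverse=True):
        if path == branch:
            return branch, ''
        if path.startswith(branch + '/'):
            return branch, path[len(branch) + 1:]
    return None, path


def closed_branches(deleted_paths, branch_paths):
    """List the branches whose '' path (the branch root) was deleted."""
    closed = set()
    for deleted in deleted_paths:
        branch, relative = split_branch_path(deleted, branch_paths)
        if branch is not None and relative == '':
            closed.add(branch)
    return sorted(closed)
```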

Closing Branches
----------------
As of this writing, Mercurial marks branches as inactive by merging them so they
have no active heads. In order to mark a branch as closed, the active head on
the branch will be used as the first parent of the new changeset, and the second
parent will be either the active head on the hg branch 'closed-branches' or the
nullrev. The commit to mark the branch as inactive will happen on the
'closed-branches' branch in Mercurial.
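
The parent selection described above might look roughly like this. Revisions
are plain integers here and the function name is hypothetical; -1 is the
revision number Mercurial uses for the null revision.

```python
NULLREV = -1  # Mercurial's revision number for the null revision


def closing_parents(branch_head, closed_branches_head=None):
    """Return (first parent, second parent) for the closing changeset.

    branch_head is the active head of the branch being closed.
    closed_branches_head is the active head of 'closed-branches', or None
    if that branch does not exist yet.
    """
    if closed_branches_head is None:
        return (branch_head, NULLREV)
    return (branch_head, closed_branches_head)
```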

Recovery
--------
hgsubversion stores several pieces of essential metadata in .hg/svn/. In order
to rebuild this data, the key 'convert_revision' should be stored in the extra
dictionary of the converted revision. The key should contain data in the format:
'svn:<uuid>/abs/path@<rev>' where <uuid> is the repo UUID, /abs/path is the
absolute path to the location the edits were made in Subversion (that is, if it
was a trunk commit on /foo/trunk, then /foo/trunk is what gets stored here, even
though the project root does not equal the repo root), and <rev> is the revision
number of the change in Subversion. This key (and its contents) have been
chosen to be compatible with the convert extension so that repos originally
converted to hg using convert can be maintained using hgsubversion if desired.
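
To make the format concrete, here is a small sketch that builds and parses a
convert_revision value. The helper names are hypothetical and the UUID in the
usage note is made up; only the 'svn:<uuid>/abs/path@<rev>' layout comes from
the description above.

```python
import re

# 'svn:<uuid>/abs/path@<rev>'; the path part is empty for the repo root.
_CONVERT_REVISION = re.compile(
    r'^svn:(?P<uuid>[^/]+)(?P<path>/.*)?@(?P<rev>\d+)$')


def format_convert_revision(uuid, path, rev):
    """Build the value stored under the 'convert_revision' extra key."""
    return 'svn:%s%s@%d' % (uuid, path, rev)


def parse_convert_revision(value):
    """Return (uuid, absolute path, revision number) from a stored value."""
    match = _CONVERT_REVISION.match(value)
    if match is None:
        raise ValueError('not an svn convert_revision: %r' % (value,))
    return match.group('uuid'), match.group('path') or '', int(match.group('rev'))
```

For a commit on /foo/trunk at revision 42 in a repository whose UUID is
'1234-uuid', format_convert_revision('1234-uuid', '/foo/trunk', 42) gives
'svn:1234-uuid/foo/trunk@42', and parse_convert_revision reverses it.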

Tags (tag_info) can be reconstructed by listing the tags directory and then
running log on each tag to determine its parent changeset. 

The last revision converted (last_rev) can be recovered simply by using the
highest revision number encountered while rebuilding the revision map.
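
A sketch of that rebuild pass, assuming the changesets are available as
(node hash, extra dictionary) pairs. The function name, the input shape and
the (revision, path) key of the resulting map are all assumptions made for
illustration; the real revmap layout may differ.

```python
def rebuild_revmap(changesets):
    """Rebuild a revision map and last_rev from 'convert_revision' extras.

    changesets is an iterable of (node_hash, extra_dict) pairs. Changesets
    without an 'svn:' convert_revision value are skipped.
    """
    revmap = {}
    last_rev = 0
    for node, extra in changesets:
        value = extra.get('convert_revision', '')
        if not value.startswith('svn:'):
            continue
        # The revision number follows the last '@'.
        location, _, rev_text = value[len('svn:'):].rpartition('@')
        # The UUID runs up to the first '/'; the rest is the absolute path.
        _uuid, slash, path = location.partition('/')
        rev = int(rev_text)
        revmap[(rev, slash + path)] = node
        last_rev = max(last_rev, rev)
    return revmap, last_rev
```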

The legacy tag_locations file does not need to be replaced - it will be
obsoleted as part of the long-term branch refactor.

The url will have to be provided by the user. The uuid can be re-requested from 
the repository.

branch_info can be rebuilt during the rebuild of the revision map by recording
the revisions of all active heads of server-side branches. branch_info contains
a map from each branch to a tuple of
(parent_branch, parent_branch_rev, branch_created_rev).
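
The shape of that map can be illustrated as follows. The namedtuple and helper
name are hypothetical, and the branch names and revision numbers in the usage
note are made up; only the three tuple fields come from the description above.

```python
from collections import namedtuple

# Field names follow the tuple described in these notes.
BranchInfo = namedtuple(
    'BranchInfo', 'parent_branch parent_branch_rev branch_created_rev')


def record_branch(branch_info, branch, parent_branch, parent_rev, created_rev):
    """Store (or overwrite) the entry for one server-side branch."""
    branch_info[branch] = BranchInfo(parent_branch, parent_rev, created_rev)
    return branch_info
```

For example, record_branch({}, 'branches/foo', 'trunk', 3, 4) yields a map
whose 'branches/foo' entry compares equal to the plain tuple ('trunk', 3, 4).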