Recently I wrote up an overview of the various options available for automating subversion within Perl. I'm rather new to this topic (and to posting things about Perl), so I'd very much appreciate feedback on what I've written.
There are two ways to automate SVN with Perl:
use system(“svn”, ...) to call the SVN binary
use Alien::SVN – a perl wrapper around the SVN API
Calling system(...) is the easier solution, but it is not portable across operating systems and it limits you to the functionality already built into the svn client command. This is not the full set of functionality provided by subversion. Finally, the subversion team has promised full backward compatibility for each minor version of the subversion API. It makes no such promises for the command line interface. If none of these are issues for you, you can stop reading now: the system(...) command is your best and fastest option.
The SVN API gives you a lot of flexibility, especially if you need to work with lots of files at once. Even if all you need is the standard svn client commands, you still may want to think twice about using system(...). If portability is important to you, you will need to wrap the calls to system(...) in code that detects and customizes the call for the current operating system.
The system(...) command requires a command name as its first parameter. Operating systems vary in the way they name and find commands. You will need to include operating system specific code to ensure that the command can be found. You will also need to choose the name of the command in an operating system specific manner. On MS operating systems the subversion client command is svn.exe. On *nix it is just plain svn.
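As a rough illustration of the wrapping described above, here is a minimal sketch that picks the command name from the operating system string and uses the list form of system(...). The function names (svn_command_name, svn_command, run_svn) and the set of $^O values treated as MS systems are my own assumptions for the example, not part of any subversion tool.

```perl
use strict;
use warnings;

# Pick the client binary name for a given OS string (as found in $^O).
# The names follow the text above: svn.exe on MS systems, svn elsewhere.
# Which $^O values count as "MS" here is an assumption of this sketch.
sub svn_command_name {
    my ($os) = @_;
    $os = $^O unless defined $os;
    return $os =~ /^(?:MSWin32|dos)$/ ? 'svn.exe' : 'svn';
}

# Build the argument list for the list form of system(). Passing a list
# generally sidesteps shell quoting differences between platforms.
sub svn_command {
    my ($os, @args) = @_;
    return (svn_command_name($os), @args);
}

# Run an svn subcommand on the current OS; true on a zero exit status.
sub run_svn {
    my @cmd = svn_command($^O, @_);
    my $rc  = system(@cmd);
    return $rc == 0;
}
```

A call such as run_svn('status', '.') would then run the client for the current directory. Finding the binary when it is not on the search path still needs separate, OS specific handling.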
Finding your way around Alien::SVN (also known as SVN::Core) for the first time can be intimidating. This article provides a basic overview that should make the Perl API for subversion a bit more approachable.
Alien::SVN is a very thin wrapper around the perl API. It is organized into classes that reflect the major subsystems inside the C/C++ version of the API. There are six such classes:
SVN::Client – API equivalent of the command line SVN client, svn. Corresponds to svn_client.h.
SVN::Repos – API equivalent of the svnadmin command.
SVN::Ra – object representing the currently selected repository access method
SVN::Wc – object providing information about an item in a working copy
SVN::Fs – object representing the storage backend used by a specific repository
SVN::Delta – create, view and manipulate descriptions of changes made by a specific repository revision
The first two of these classes, SVN::Client and SVN::Repos, are API analogs to the command line binaries: svn and svnadmin. SVN::Ra can be used to provide server side support for incoming requests for remote repository access. The remaining classes define objects that are used either as parameters or return values to those two classes.
For an overview of the subsystems of the SVN API, see the subversion book. The editions for subversion 1.3 and up describe the subsystems in chapter 8, titled “Embedding Subversion”. Versions for earlier subversion releases (1.0 – 1.2) title this same chapter “Developer Information”.
There are two subsystems that are used exclusively by subversion repository clients and perl modules that support them – SVN::Client and SVN::Wc.
SVN::Client provides access to commands to add, view, remove, and modify the content of a subversion repository. It is the API analog to the command line SVN client, svn (*nix) or svn.exe on Microsoft based operating systems.
The functions on the SVN::Client object can be classed into three groups:
configuration functions – setter methods that configure the behavior of an SVN::Client instance.
repository UUID functions – return the repository UUID
command functions – methods corresponding to the various svn subcommands, e.g. copy, commit, ls, cat, propget, revpropget, etc.
The behavior of a SVN::Client object is customized primarily through call back functions. Most of these can be set via the SVN::Client object's constructor or one of its setter methods:
status – this callback is passed to the $svnclient->status(....) method.
notify – this callback is defined as part of the SVN::Client configuration and can be set using the notify(...) method. It is called in all of the following contexts:
after each file or directory that is added via add(...) and after each directory that is made via mkdir(...).
after a file is committed, copied, deleted, exported, imported, or merged. It is also called when conflicts are resolved via resolved(...) ;
after an attempt is made to query an unversioned file via log(...) for revision history
after each file that is revised or added because of a call to update(...)
after each file that is added, deleted, or changed because of a call to switch(...)
after status(...) retrieves a file “stored” in the repository via svn::externals.
When files are moved, it is called twice – once after a copy of the file is added, and once after the copy source is deleted.
log_msg – this callback is also part of the SVN::Client configuration. It is set using the log_msg(...) method. It is called to generate log messages during the commit process.
cancel – this callback is also part of the configuration and is set via the cancel(...) method. It is called whenever subversion needs to ask the user for permission to cancel an operation.
SVN::Wc - defines objects and enums used to pass information to and from call back functions:
svn_wc_status_t – describes the status of a particular file found in the repository. This object is passed as a parameter to the status callback.
SVN::Wc::Status::XXX – each XXX is a constant that describes the status of a file found in the working copy of a repository. This constant is passed in as one of the parameters to the status callback.
svn_wc_entry_t – describes the repository attributes of an item stored in the working copy. This is part of the svn_wc_status_t object and is returned by its entry() method.
SVN::Wc::Schedule::XXX – each XXX is a constant that describes the primary action scheduled for an item in the working copy. It is used together with methods like copied() to indicate the set of actions that will be taken when the item is eventually committed to the repository. This constant is returned by the schedule() method of svn_wc_entry_t.
SVN::Wc::Notify::Action::XXX – each XXX is a constant that describes the action that triggered a call to the notify callback. This constant is passed in as the second parameter of the notify callback.
SVN::Wc::Notify::State::XXX – each XXX is a constant that describes the state of the file that triggered a call to the notify callback. This constant is passed as the fifth parameter of the notify callback.
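To make the callback mechanics a little more concrete, here is a small sketch of a notify callback that just records what it is told. The positions of the action (second) and state (fifth) parameters come from the descriptions above; the remaining positions (node kind, MIME type, revision number) follow my reading of the SVN::Client documentation and should be treated as assumptions. The name record_notify and the working copy path in the comment are hypothetical.

```perl
use strict;
use warnings;

# Collect one record per notification. Assumed parameter order: path,
# action, node kind, MIME type, state, revision number.
my @events;
sub record_notify {
    my ($path, $action, $node_kind, $mime_type, $state, $revision) = @_;
    push @events, { path => $path, action => $action, state => $state };
    return;
}

# Only touch the bindings when they are actually installed.
if (eval { require SVN::Client; 1 }) {
    # The constructor accepts the notify callback as a named option.
    my $client = SVN::Client->new(notify => \&record_notify);
    # A later call such as $client->update('/path/to/wc', 'HEAD', 1)
    # would then invoke record_notify once per affected item.
}
```

The recorded action and state values are plain integers. Comparing them against the SVN::Wc::Notify::Action::XXX and SVN::Wc::Notify::State::XXX constants is what turns them into something readable.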
The SVN::Repos, SVN::Ra and SVN::Fs subsystems are used exclusively by perl modules that support and customize local and remote repository access.
SVN::Repos – provides access to commands to create, open, and administer a repository. It is the API analog to the svnadmin command. Whereas SVN::Client operates on a specific revision or set of revisions, SVN::Repos commands operate on the repository as a whole. Repository manipulation commands include hook definition, access to node and commit editors, initialization and clean up of transactions, and dumping and loading of repository content and metadata.
SVN::Repos objects are only used to access a repository locally on the current machine. If you are developing tools to access the repository directly, you can create a repository object via one of two functions defined in the SVN::Repos package:
SVN::Repos::open(...) opens an existing repository
SVN::Repos::create(...) creates a new repository
Both methods return an SVN::Repos object which can be used to manipulate and extract data from an open repository. For documentation of repository manipulation methods, see the file svn_repos.h in the C/C++ API documentation. Any method defined in that file is a method of an SVN::Repos object if it takes svn_repos_t as its first parameter.
If you are developing Perl tools to support the server side of remote access, you won't work with SVN::Repos directly. Instead you use an SVN::Ra object. This object provides access to the same commands but ensures that the user and system requesting remote access have the rights to do what they want.
The code that creates an SVN::Ra object on the server side is usually part of a module that has been registered on the web-server as the handler for URIs that match a particular pattern or protocol. Inside this module, an instance of SVN::Ra is created. Its methods can be used to perform various standard and custom actions on the repository.
SVN::Ra provides an object representing the client and server sides of remote repository access. This object combines data specific to a user requesting remote access and data specific to a repository with an implementation for a specific repository access method.
SVN supports many kinds of remote access, and new ones can be created by defining remote access plug-ins, which provide support for specific remote access mechanisms. Recent versions of subversion, including the version supported by Alien::SVN (1.4.6), have well developed support for remote access via the file system, WebDAV/SSL, ssh, and a proprietary svnserve protocol.
SVN::Fs provides an object representing the file system that backs a repository. SVN supports more than one kind of repository storage mechanism. Recent versions of subversion, including the version supported by Alien::SVN, support both Berkeley DB and FSFS backends.
Full access to the file system is only available using local access methods. The SVN::Repos::fs() function returns an instance of this object. No such method exists on SVN::Ra. SVN::Ra provides only indirect access to the file system.
SVN::Delta defines objects and constants needed to create, view, and edit deltas.
Each time subversion commits a group of files it captures all of the changes made by those files in a delta. The delta is defined in such a way that we can always construct a later revision from an earlier revision plus a delta. Similarly given two revisions we can always construct whatever delta is needed to move forward in time from the earlier revision to the later revision. Deltas, however, are not reversible – they cannot be used to go backwards in time from the later revision to the earlier revision.
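The forward-only property is easier to see in a toy model. The sketch below is not subversion's delta format or API; it is a deliberately simplified illustration in which a revision is a hash of path to content and a delta is a list of add/modify/delete records. The function names apply_delta and make_delta are made up for this example.

```perl
use strict;
use warnings;

# Toy model: a "revision" maps paths to contents, and a "delta" is a
# list of [op, path, content] records. Earlier revision + delta = later.
sub apply_delta {
    my ($rev, $delta) = @_;
    my %next = %$rev;                    # leave the earlier revision intact
    for my $change (@$delta) {
        my ($op, $path, $content) = @$change;
        if ($op eq 'delete') { delete $next{$path} }
        else                 { $next{$path} = $content }   # add or modify
    }
    return \%next;
}

# Given two revisions, build the delta that moves forward from old to new.
sub make_delta {
    my ($old, $new) = @_;
    my @delta;
    for my $path (sort keys %$new) {
        if (!exists $old->{$path}) {
            push @delta, ['add', $path, $new->{$path}];
        }
        elsif ($old->{$path} ne $new->{$path}) {
            push @delta, ['modify', $path, $new->{$path}];
        }
    }
    for my $path (sort keys %$old) {
        # A delete record carries no old content, so it cannot be undone.
        push @delta, ['delete', $path] unless exists $new->{$path};
    }
    return \@delta;
}
```

In this model the delete and modify records do not keep the old content, which is why the delta alone is not enough to step backwards from the later revision to the earlier one.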
You won't need to work with deltas unless you plan to write a customized commit command, a remote access method or some other command that studies and reports on how the repository and its files have changed over time. Deltas are used by three types of processes: processes that generate commit requests, processes that receive and apply commit requests, and processes that massage, import or export data stored in a repository.
SVN::Client generates deltas whenever a group of files is committed. It compares the most recently committed version of each file to the currently uncommitted version of each file. The delta is then sent on to the repository as the commit request. The methods used by this delta creation process are all functions defined within the SVN::Delta namespace. Full documentation can be found by looking at the source code for svn_delta.h.
When an SVN::Ra object receives a commit request, it extracts a delta from it and applies the delta to the existing version, thus creating the new version. The functions used to apply the delta are also declared and documented in svn_delta.h.
The subversion repository itself stores version history of each file as a set of deltas. By applying N deltas to revision X we get revision X + N. Consequently, deltas are also used when importing, viewing, or exporting the change history of a repository. They are also used to update working copies from the repository and by diff commands.
All of these commands work with deltas via a SVN::Delta::Editor object. Both SVN::Ra and SVN::Repos have methods that accept a customized instance of SVN::Delta::Editor and use it to scan the repository's set of deltas. Its behavior can be customized by overriding methods that define which deltas are interesting and what to do with them.
Alien::SVN provides three other modules in addition to the subsystem classes:
SVN::Core – enumerations and definitions for data structures shared by two or more of the subsystems.
SVN::Base – perlifies access to portions of the C/C++ API libraries. You won't normally use this class directly – instead it is used as the base class for all of the other classes.
Alien::SVN – not really a class as much as a place holder. If this appears in the depends list of a CPAN package, CPAN will automatically fetch the files needed for the Perl SVN API. It also downloads the source code for subversion 1.4.6 and compiles the API libraries. (The command line binaries are not compiled since Alien::SVN doesn't use them.)
Most of the API functions accept a “pool” parameter. A pool is a large chunk of memory that can be allocated and deallocated as a single unit or broken up into small chunks. It is a convenient way to allocate tiny bits of data that need to be cleaned up as a group. As such, it is especially well suited to recursive algorithms which typically allocate lots of data for child nodes and then need to clean up all of those nodes as a group. Subversion has a lot of recursive programming and pools are especially suitable for its internal architecture.
Pools are one of many data structures and resources that subversion has borrowed from the Apache Portable Runtime library – an OS independent library for performing various “system” like functions, including memory allocation and release.
You can find more about how subversion uses pools by looking in the subversion book. For versions 1.3 and above, the relevant section in chapter 8 is titled “The Apache Portable Runtime Library”. For earlier subversion versions (1.0-1.2), memory pools are discussed in a section titled ”Programming with memory pools”.
The subversion API perl bindings provide access to pool objects via the SVN::Pool class.
By default, exceptions thrown by the Subversion API cause Alien::SVN to croak. To override this behavior with custom exception handling, assign a reference to your own handler subroutine to the $SVN::Error::handler global variable. The handler is passed an svn_error_t object describing the error.
The SVN::Core module recommends that those interested in defining a custom error handler take a look at the implementation of SVN::Error::croak_on_error and SVN::Error::expanded_message. One should also look at the C/C++ API documentation of svn_error_t.
The documentation for the Perl SVN API is somewhat sketchy, but it usually preserves the names of functions and constants used in the C/C++ API and notes the exceptions. For more detailed information about each of its modules, you might wish to look at the C/C++ API documentation.
The following table summarizes the mapping between C/C++ API documentation and the subsystem modules:
SVN module | C/C++ header file |
SVN::Client | svn_client.h |
SVN::Repos | svn_repos.h |
SVN::Ra | svn_ra.h |
SVN::Fs | svn_fs.h |
SVN::Wc | svn_wc.h |
SVN::Delta | svn_delta.h |
A few especially important C/C++ APIs also have a perl oriented name: e.g. SVN::Delta::Editor is the perl name for svn_delta_editor in the C/C++ API.
All of the other C/C++ API types, classes and variables use the same name in both the C/C++ API and in the perl bindings. For example, both the C/C++ API and the perl bindings call the error class svn_error_t.
For more information, please see the documentation of the various files in the Alien::SVN distribution.
Subversion functions get around portability problems by using relative and absolute URIs. Working copies of files and directories may be identified using either relative or absolute URIs. Relative URIs are always resolved relative to the current directory.
Files and directories within a repository are identified using fully qualified URIs. In these URIs the protocol identifies the repository access method. The initial portion of the path identifies the repository. The remainder identifies a file or directory within the repository. These URIs are expected to use UTF8 encoded canonical form.
To properly produce paths for subversion you can use the following Perl CPAN modules:
URI::Escape – see particularly the functions uri_escape_utf8(...) and uri_unescape(...).
URI – see particularly the method $uri->canonical(...).
One of the best ways to get comfortable with using an API is to look at applications that have already been written using that API. Looking at existing applications is also a good way to avoid reinventing the wheel. CPAN has about 40 subversion related distributions, 15 of which use Alien::SVN to work with a subversion repository:
CPAN distribution | Description |
SVN-S4 | Subclasses SVN::Client by adding various helper functions. |
SVN-Web | A web based front end to subversion repositories. It supports browsing revisions, viewing diffs, blames, and annotations; generating RSS feeds of commits; stepping through revisions and much more |
SVN-RaWeb-Light | A light-weight web-based browser for a subversion repository. |
Catalyst-Model-SVN | Subversion repository browser for use with Catalyst managed websites. |
SVN-Log | Dumps change logs from the subversion server. Uses Alien-SVN where available and the svn binaries otherwise. |
SVN-Log-Index | Generates a KinoSearch index for SVN log files, enabling full text search of those files. |
SVN-Churn | Generates a graph tracking the number of changed lines over time in a subversion repository. Uses SVN-Log to extract change logs. |
Log-Accounting-SVN | Generates summary statistics reports for logs retrieved by SVN-Log |
SVN-Txn-Props | Ties transaction properties to a perl hash so that transactions can be programmatically viewed and manipulated, perhaps within a pre-commit hook. |
SVN-Simple | A simple implementation of an SVN::Delta::Editor subclass |
SVN-Mirror | Mirrors a remote Subversion, CVS or Perforce repository |
SVN-Push | Another subversion mirroring tool. This one claims to be based on SVN-Mirror but pushes a subset of repository A onto repository B – is this importing? committing? |
SVN-Pusher | Yet another subversion mirroring tool. |
SVN-Deploy | Configures and manages a subversion repository as a deployment database. A deployment database stores rules for extracting code from a codebase and compiling it into a distribution for use in testing and deployment. Information about each product version is stored as a set of subversion properties. |
Occasionally there are times when the subversion libraries cannot be installed. For example, subversion and the web server might be using different versions of the Apache Portable Runtime Library. For this reason, some modules provide the option of using either the subversion binary or the subversion API. They include:
SVN-Log
SVN-Churn – based on SVN-Log
SVN-Log-Index – also based on SVN-Log
Log-Accounting-SVN – also based on SVN-Log
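A script of your own can make the same choice at run time. The following is only a sketch of one way to do it, not how SVN-Log or the other distributions actually decide; the function name svn_backend and its return values 'api' and 'binary' are invented for the example.

```perl
use strict;
use warnings;

# Report which backend is usable: 'api' when the given binding module
# loads, otherwise 'binary'. The module name is a parameter so the check
# can be tried out even where the bindings are not installed.
sub svn_backend {
    my ($module) = @_;
    $module = 'SVN::Client' unless defined $module;
    (my $file = "$module.pm") =~ s{::}{/}g;   # SVN::Client -> SVN/Client.pm
    return eval { require $file; 1 } ? 'api' : 'binary';
}
```

Code that gets 'binary' back would then fall back to calling svn via system(...), with the portability caveats discussed earlier.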
In addition, CPAN has several distributions using the subversion binaries exclusively:
CPAN distribution | Description |
SVN-Agent | Object oriented wrapper around the svn binary – each of the subcommands – add, commit, ls, cat, diff, etc correspond to methods on this object. |
SVN-Class | Subclasses Path::Class to allow programmatic access to subversion client side commands. Path::Class is a kinder gentler wrapper around File::Spec. |
SVN-Look | object oriented wrapper around Subversion's svnlook binary |
SVN-SVNLook | Another wrapper around Subversion's svnlook binary |
App-SVN-Bisect | Tool to search for a revision containing specific attributes, e.g. a text string or function name. Used most often to find out when a feature or function was added to a repository. Uses svn binary. |
App-SVNBinarySearch | Another implementation of binary search using the svn binary. |
CPAN also provides a number of Subversion modules that use neither the API nor the binaries. Most of these distributions fall into one of six categories:
security – tools for generating and editing authz and other configuration files affecting repository security.
dump files – tools for parsing and editing dump files. A dump file is a portable format for a repository's deltas. Certain maintenance actions (e.g. completely removing any trace of a file) can only be done by editing the deltas. Editing deltas is also the only way to split up a single repository into several subrepositories.
binary diff files – tools for parsing files describing the difference between two revisions.
hooks – subversion lets one define scripts that execute custom actions before and after commits and other repository actions. These subversion distributions provide tools to manage those hooks. Most of the distributions customize the behavior of the post-commit hook using SVN::Notify as a base.
application integration - tools for integrating subversion access with third party software, e.g. Request Tracker
public repository access – tools for accessing custom public repositories on the web, e.g. the OpenSVN.csie.org repository.
CPAN distribution | Description |
SVN-ACL | sets up subversion repository security using YAML files |
SVN-Access | a command line and object oriented interface for maintaining subversion repository security. |
SVN-Dump | Parses a subversion dump file |
SVN-DumpReloc | Script to rewrite paths in a dump file |
SVN-Dumpfile | Yet another parser and editor for subversion dump files |
SVN-Dumpfilter | Edits a dump file using a visitor pattern. |
Parse-SVNDiff | Parser for subversion's binary diff format |
SVN-Hook | a utility for managing the hooks used by a subversion repository. To use this module you must plan to use SVN::Hooks from the get-go. It can be used with existing repositories, but only if they have not yet defined any hooks. |
SVN-Hooks | A perl library of composable hooks for use with SVN::Hook managed repositories |
SVN-Notify | A script that can be called from within a subversion repository post-commit hook. The script relies on sendmail and so is likely limited to use on *nix style systems. |
SVN-Notify-Config | YAML based configuration of the SVN-Notify script |
SVN-Notify-Mirror | Mirrors committed svn directories. This is implemented as an action within a post-commit script managed by SVN-Notify. |
SVN-Notify-Snapshot | Enables a post-commit hook managed by SVN-Notify to take a snapshot of a recently committed subversion directory. |
SVN-Notify-Filter-AuthZMail | Tells SVN-Notify where to send commit notifications by reading the authz file. |
SVN-Notify-Filter-EmailFlatFileDB | Uses a flat file database to generate email addresses for SVN-Notify |
SVN-Notify-Filter-Markdown | Uses a file in Markdown format to generate email addresses for SVN-Notify |
SVN-Notify-Filter-Watchers | Uses Subversion properties to generate email addresses for SVN-Notify |
RT-Integration-SVN | Integrates subversion with Request Tracker |
WWW-OpenSVN | generates http requests for projects stored at OpenSVN.csie.org. OpenSVN is a free subversion repository sponsored by students at National Taiwan University |