CocoaDev


Announcing RegexKit - A framework for regular expressions using the PCRE library.

Edited on 01/29/2008 to reflect version 0.6.

Version 0.6 Beta has been released.

New features in 0.6:

The project is hosted at SourceForge, and you can access it via:

http://regexkit.sourceforge.net/

The framework documentation is available at:

http://regexkit.sourceforge.net/Documentation

The SourceForge project page is:

http://sourceforge.net/projects/regexkit/

I would appreciate feedback from experienced Cocoa developers regarding the API and usability of the framework.

Some highlights:

Garbage Collection is completely optional, like the rest of Cocoa. It is dynamically activated at load time if the Cocoa NSGarbageCollector is active. Otherwise the standard retain / release style of managing memory is used.

The 32 bit architectures (ppc, i386) require Mac OS X 10.4 or later, while the 64 bit architectures require Mac OS X 10.5 or later.

The framework's documentation is now integrated with Xcode 3.0, just like the rest of the Apple documentation. With the Research Assistant open, placing the editor's insertion point over a RegexKit method will bring up that method's documentation, including related methods. You can click on any of these and Xcode will open the corresponding page in the documentation viewer. The documentation viewer also offers full keyword searching of the entire documentation.

As an example of what’s possible with the NSString additions, consider the following task: find a hex value in an NSString, extract it, and convert it to an unsigned int. Here’s how you can do it with RegexKit:

unsigned int hexValue = 0;

[@"Conversion color: 0xFF0000, Order: 1" getCapturesWithRegexAndReferences:@"color: (0x[0-9a-fA-F]+)", @"${1:%x}", &hexValue, NULL]; // hexValue == 16711680 (0xFF0000)

getCapturesWithRegexAndReferences: allows you to easily match a capture subpattern from a regular expression and perform a scanf() style conversion. Also note that there is no need to create a regular expression object: the framework will automatically convert NSString objects to RKRegex objects for you. In fact, getCapturesWithRegexAndReferences: will accept either an instantiated RKRegex object or an NSString, which it will convert.
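To make the "match a capture, then apply a scanf() style conversion" idea concrete, here is a minimal sketch in plain C, which also compiles unchanged in an Objective-C source file. It is not RegexKit code: it assumes the POSIX regex.h engine instead of PCRE, and the helper name scan_hex_capture is hypothetical, invented for this illustration.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of RegexKit): finds the first match of
 * `pattern` in `subject`, then converts capture group 1 with a
 * scanf-style "%x" conversion. Returns 1 on success, 0 otherwise. */
int scan_hex_capture(const char *subject, const char *pattern, unsigned int *out)
{
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = first capture */
    char buf[32];
    int ok = 0;

    /* POSIX extended syntax is assumed here, not PCRE. */
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;

    if (re.re_nsub >= 1 &&
        regexec(&re, subject, 2, m, 0) == 0 &&
        m[1].rm_so != -1) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < sizeof buf) {
            /* Copy the captured text so it is NUL-terminated for sscanf. */
            memcpy(buf, subject + m[1].rm_so, len);
            buf[len] = '\0';
            ok = (sscanf(buf, "%x", out) == 1);
        }
    }

    regfree(&re);
    return ok;
}
```

Called with the subject and pattern from the example above, this helper stores 0xFF0000 in the output variable. The RegexKit one-liner folds these steps (compile, match, extract, convert) into a single message send.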

You can also create new strings with references to regular expression matched text:

NSString *newString = [@"Doe, John" stringByMatching:@"(?<last>\\S+),\\s*(?<first>\\S+)" withReferenceString:@"Dear ${first} ${last},\n\nHow are you today, ${first}?"]; /* newString == Dear John Doe,

How are you today, John? */

There are similar search and replace methods as well.

With this alpha release, I’m looking for comments from Objective-C Cocoa developers regarding the API. Any other comments are welcome as well. I’m in the final push to get a “1.0” general release done, so I’d like to freeze the features of the framework and concentrate on getting a polished release out the door. As I mentioned, there’s some conflicting information in the current release regarding first-time user related information, such as references to an old “cli_test*” target/code. I don’t expect any experienced Cocoa developer to be thrown off, though. The documentation regarding adding the framework to your project should (?) be just fine; there are just no examples yet.


How does this compare to OgreKit? –boredzo


I will try to keep this as unbiased as possible, and I encourage others to make any changes necessary to keep it objective. That said, as a full disclosure, I am the author of RegexKit. I’m probably the only person who can speak for it right now, since it has just been released. I also know very little about OgreKit. Take the following with however large a grain of salt you feel is warranted.

Regular Expression Engine

RegexKit

OgreKit

License

RegexKit

OgreKit

Overall

RegexKit

OgreKit

RegexKit strengths

During development, a strong effort was made to keep the framework lightweight and fast. No modifications are made to the regular expression engine in terms of performance. Instead, the tuning effort was focused on keeping the overhead of object-oriented access to PCRE to a minimum. Examples include:

RegexKit wild claims

RegexKit weaknesses

OgreKit strengths

Sorry, I’m just not familiar enough with OgreKit to write up anything comprehensive here.


A lot of regex classes provide ways to get an array of pattern captures. I see that RegexKit allows for named captures and “${1}” captures, but what if I don’t know how many captures a regex will produce? Is there a way for RegexKit to do this?

Answering my own question:

@interface NSString (RKAdditions)

@implementation NSString (RKAdditions)

This works fairly well so far. The first capture is the entire matched string. Anyone see any improvements? -G


One improvement: If rangesForCharacters is NULL then don’t iterate through the ranges. -hac
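The general shape of the approach discussed above (ask the compiled regex how many capture groups it has, fetch that many ranges, and skip the loop entirely when nothing matched) can be sketched in plain C. This is only an illustration under stated assumptions, not the category code from this thread: it uses POSIX regex.h instead of RegexKit or PCRE, and the names copy_all_captures and free_captures are hypothetical.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees an array returned by copy_all_captures(). */
void free_captures(char **captures, size_t count)
{
    size_t i;
    if (captures == NULL)
        return;
    for (i = 0; i < count; i++)
        free(captures[i]);
    free(captures);
}

/* Hypothetical helper (not part of RegexKit): returns a malloc'd array of
 * malloc'd strings, one per capture group, with index 0 holding the whole
 * match. *count receives the number of entries. Returns NULL (and sets
 * *count to 0) when there is no match or on error. Groups that did not
 * take part in the match come back as empty strings. */
char **copy_all_captures(const char *subject, const char *pattern, size_t *count)
{
    regex_t re;
    regmatch_t *ranges;
    char **captures = NULL;
    size_t n, i;

    *count = 0;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    /* The compiled regex knows its own group count; +1 for the whole match. */
    n = re.re_nsub + 1;
    ranges = malloc(n * sizeof *ranges);

    /* Only iterate when ranges exist and the match succeeded. */
    if (ranges != NULL && regexec(&re, subject, n, ranges, 0) == 0) {
        captures = calloc(n, sizeof *captures);
        for (i = 0; captures != NULL && i < n; i++) {
            const char *start = subject;
            size_t len = 0;
            if (ranges[i].rm_so != -1) {
                start = subject + ranges[i].rm_so;
                len = (size_t)(ranges[i].rm_eo - ranges[i].rm_so);
            }
            captures[i] = malloc(len + 1);
            if (captures[i] == NULL) {
                free_captures(captures, i);   /* release what was copied so far */
                captures = NULL;
                break;
            }
            memcpy(captures[i], start, len);
            captures[i][len] = '\0';
        }
        if (captures != NULL)
            *count = n;
    }

    free(ranges);
    regfree(&re);
    return captures;
}
```

As in G's version, the first entry is the entire matched string. Note that POSIX extended syntax lacks PCRE shorthands such as \S, so a pattern like "([^ ,]+), *([^ ,]+)" stands in for the PCRE patterns used elsewhere on this page.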


Is there a recommended application for generating regexes compatible with RegexKit and testing them against sample input?


There is now. Voilà: http://atastypixel.com/blog/reginald-regex-explorer/