CocoaDev


A regular expression is a pattern that describes a set of strings; it is used to search, match, and replace text. Also known as a regex or regexp.



References:


Software:


I have looked into a few regular-expression libraries for use in my program, but all the ones I have tried are rather slow. I've resorted to my original approach of piping the data to a Perl script, which applies the regular expression and returns the result:

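    #!/usr/bin/perl
    # script for removing HTML

    # Read all of standard input into one string so tags can span lines.
    $str = "";
    while ($line = <STDIN>) { $str .= $line; }

    # Delete every <...> tag; \1 closes whichever quote opened an attribute value.
    $str =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gsi;
    print $str;

If anyone knows of a **fast** way of doing regular expressions that is thread-safe, please let me know!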

Maybe not as easy to work with, but you could use flex to generate the scanner (I _think_ the experimental C++ output is thread-safe). There are other tools like flex that generate a DFA-based matcher, and such matchers are typically much faster than the various regexp libraries. As for libraries, Perl's engine is probably the most optimized one!
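To make the DFA idea concrete, here is a minimal hand-written sketch in C of the kind of state machine a scanner generator would produce for the tag-stripping job. It is not flex output, and the name `strip_tags` is hypothetical. It also differs from the Perl regex in one respect: an unterminated tag swallows the rest of the input instead of being left in place.

```c
#include <stddef.h>

/* States of a small hand-written DFA; a generator such as flex would emit a
   table-driven equivalent from a rule file. */
enum state { TEXT, TAG, SQUOTE, DQUOTE };

/* Hypothetical helper: copy src to dst with <...> tags removed. A '>' inside
   a quoted attribute value does not close the tag. dst must have room for
   the length of src plus one byte. No global or static state is used, so
   concurrent calls on different buffers do not interfere with each other.
   Returns the length of the output. */
size_t strip_tags(const char *src, char *dst)
{
    enum state s = TEXT;
    size_t n = 0;

    for (; *src != '\0'; src++) {
        char c = *src;
        switch (s) {
        case TEXT:                      /* outside any tag: copy characters */
            if (c == '<') s = TAG;
            else dst[n++] = c;
            break;
        case TAG:                       /* inside a tag: discard characters */
            if (c == '>') s = TEXT;
            else if (c == '\'') s = SQUOTE;
            else if (c == '"') s = DQUOTE;
            break;
        case SQUOTE:                    /* inside '...': only ' ends it */
            if (c == '\'') s = TAG;
            break;
        case DQUOTE:                    /* inside "...": only " ends it */
            if (c == '"') s = TAG;
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Each input character is examined exactly once and there is no backtracking, which is where the speed advantage of a DFA-based matcher comes from.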

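P.S. To do what the Perl script above does directly from the command line, you can call the following. `\x27` stands in for the single quote so the pattern survives shell quoting, and `-0777` reads the whole input at once, as the script does:

    perl -0777 -pe 's/<(?:[^>\x27"]*|([\x27"]).*?\1)*>//gsi'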

I don’t know if it’s any faster than what you’ve already tried, or even usable, but NSString has some undocumented regexp functions.


There is an NSStringRegExp addition: http://homepage.mac.com/jrc/contrib/


Note that this addition is not Unicode-safe; in particular, the returned ranges are incorrect for strings containing non-ASCII characters, which will probably result in thrown exceptions when the substring operations fail.
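One plausible cause of such range errors (an assumption; nothing above states the actual cause) is that C regex functions report byte offsets into a UTF-8 buffer, whereas NSString ranges count UTF-16 code units. Under that assumption, a wrapper would need a conversion along these lines. The function name is hypothetical:

```c
#include <stddef.h>

/* Hypothetical helper: number of UTF-16 code units encoded by the first
   byte_offset bytes of a valid UTF-8 string. byte_offset is assumed to fall
   on a character boundary and to lie within the string. */
size_t utf16_index_for_utf8_offset(const char *utf8, size_t byte_offset)
{
    size_t units = 0;

    for (size_t i = 0; i < byte_offset; i++) {
        unsigned char b = (unsigned char)utf8[i];
        if ((b & 0xC0) == 0x80)
            continue;                   /* continuation byte, already counted */
        /* Lead bytes 0xF0 and above start 4-byte sequences, which become a
           surrogate pair (two code units) in UTF-16; everything else is one. */
        units += (b >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For pure ASCII text the byte offset and the UTF-16 index coincide, which would explain why the problem only shows up with non-ASCII characters.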

Another wrapper, built around the POSIX regexec function from the C library, is also available at: http://www.spikesoft.ch/?p=24


No need for an external solution to test an NSString object against a regexp! See "Regular Expressions for NSString": http://www.stiefels.net/2007/01/24/regular-expressions-for-nsstring/


Unfortunately in many cases being able to test only for matching without being able to get the location or contents of match groups is as close to useless as makes no difference.
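For comparison, the POSIX regcomp/regexec interface does report the location of the whole match and of each group. The following is a minimal sketch; `match_groups` is a hypothetical helper, not the API of any wrapper linked above, and the offsets it returns are byte offsets into the C string.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match pattern (POSIX extended syntax) against text
   and fill groups[0..max_groups-1] with byte ranges. groups[0] is the whole
   match, groups[1] the first parenthesized group, and so on.
   Returns 1 on a match, 0 if regexec reports no match (or any other
   failure), and -1 if the pattern does not compile. */
int match_groups(const char *pattern, const char *text,
                 regmatch_t *groups, size_t max_groups)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&re, text, max_groups, groups, 0);
    regfree(&re);                       /* release the compiled pattern */
    return rc == 0 ? 1 : 0;
}
```

After a successful call, the text of group `i` starts at `text + groups[i].rm_so` and is `groups[i].rm_eo - groups[i].rm_so` bytes long. Compiling the pattern on every call keeps the sketch short; code that applies the same pattern repeatedly would compile it once and reuse the `regex_t`.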