Matching Regular Expressions that don’t end with…

Regular expressions do not mix well with syntax that requires memory, such as XML. I was trying to add a <br /> tag to every line that did not have a </p> tag. For example, I can print the lines I want with grep -v '^.*</p>'

Anyway, this turns out to be a bear, because the negative lookahead (?!expression) just isn’t working for me with sed. Lookahead is a Perl-style extension, and sed’s POSIX regular expressions don’t support it.
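For comparison, here is roughly what the lookahead version looks like in a tool that does support it. This is only a minimal sketch using Python's re module; the function name add_br_lookahead and the sample text are made up for illustration.

```python
import re

def add_br_lookahead(text):
    """Append <br /> to each line that has a word character but no </p>.

    Hypothetical helper for illustration only.
    """
    # (?!.*</p>) is the negative lookahead: it fails on any line containing </p>.
    # (.*\w.*) then requires at least one word character, so blank lines are skipped.
    # re.MULTILINE makes ^ and $ match at the start and end of every line.
    pattern = r'^(?!.*</p>)(.*\w.*)$'
    return re.sub(pattern, r'\1<br />', text, flags=re.MULTILINE)

sample = "Hello world\n<p>A paragraph</p>\n\nAnother line"
print(add_br_lookahead(sample))
```

That is no help inside sed itself, though.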

So what do I do? I make two substitutions!

s/\(\w.*\)/\1<br \/>/g

This adds a <br /> to every line that contains a word character.

s/\(.*<\/p>\)<br \/>/\1/g

This takes the <br /> back off any line that has a </p>.
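The same two passes can be sketched in Python to see what they do to a small input. This is an illustration under assumptions, not the exact commands from the sed session: the function name add_br_two_pass and the sample text are hypothetical, and the groups use Python's plain parentheses.

```python
import re

def add_br_two_pass(text):
    """Add <br /> to lines without </p>, using two plain substitutions.

    Hypothetical helper for illustration only.
    """
    # Pass 1: from the first word character to the end of the line, append <br />.
    # '.' does not match a newline, so each line is handled separately.
    with_br = re.sub(r'(\w.*)', r'\1<br />', text)
    # Pass 2: where a <br /> directly follows </p>, drop the <br /> again.
    return re.sub(r'(.*</p>)<br />', r'\1', with_br)

sample = "Hello world\n<p>A paragraph</p>\n\nAnother line"
print(add_br_two_pass(sample))
```

Blank lines never match the first pass, so they come through untouched.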

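The good thing about this is that it only needs grouping and back-references, which pretty much every regular expression flavor has, unlike lookahead, which only works in utilities with Perl-style regular expressions.

You could run this with sed, vim, perl, python, whatever, and it will work, as long as you adjust the group syntax to the tool (plain parentheses instead of \( \) in perl and python, for example). Note that \w is a GNU extension in sed.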
