Monday, August 11, 2008

Sed (Stream Line Editor)

The slash as a delimiter
Say /usr/local/bin to /common/bin - you could use the backslash to quote the slash:
sed 's/\/usr\/local\/bin/\/common\/bin/' new
Gulp. Some call this a 'Picket Fence' and it's ugly. It is easier to read if you use an underline instead of a slash as a delimiter:
sed 's_/usr/local/bin_/common/bin_' new
Some people use colons:
sed 's:/usr/local/bin:/common/bin:' new
Others use the "|" character.
sed 's|/usr/local/bin|/common/bin|' new


Using \1 to keep part of the pattern
To review, the escaped parentheses (that is, parentheses with backslashes before them) remember portions of the regular expression. You can use this to exclude part of the regular expression. The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern. Sed has up to nine remembered patterns.

If you wanted to keep the first word of a line, and delete the rest of the line, mark the important part with the parenthesis:
sed 's/\([a-z]*\).*/\1/'

The "\1" doesn't have to be in the replacement string (in the right hand side). It can be in the pattern you are searching for (in the left hand side). If you want to eliminate duplicated words, you can try:
sed 's/\([a-z]*\) \1/\1/'
You can have up to nine values: "\1" thru "\9."

"[^ ]*," - matches everything except a space.

This next example keeps the first word on the line but deletes the second:
sed 's/\([a-zA-Z]*\) \([a-zA-Z]*\) /\1 /' new

No comments: