This document describes the Unix stream editor, sed(1), and provides
examples.

1. Sed and perl regex

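	Sed uses POSIX regular expressions (basic by default, extended with
	the -E option), not full perl regular expressions. GNU sed also
	accepts some perl-style escapes such as "\s", "\w" and "\W", and
	supports the POSIX character classes described below.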
	A. Character classes
	    Character classes are a POSIX feature (perlre supports them too)
	    and are similar to escapes like "\s" for matching whitespace.
	    A class is written as "[:name:]" (like [:space:]) and is valid
	    only inside a bracket expression, where it can be combined with
	    other characters. For example, consider the difference between
	    the following 2 re's:
	'[,[:space:]]'  - match one character that is either a comma or whitespace
	',[[:space:]]'  - match a comma followed by one whitespace character
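	    The difference can be checked with a minimal sketch like the one
	    below. The sample string and the variable names are made up for
	    illustration, and a POSIX-compliant sed is assumed.

```shell
# Assumption: a POSIX-compliant sed; the sample text is illustrative only.

# [:space:] and the comma sit in one bracket expression, so every comma
# and every whitespace character is replaced individually.
in_set=$(printf 'a, b,c d\n' | sed 's/[,[:space:]]/_/g')
echo "$in_set"

# A comma followed by one whitespace character: only ", " is replaced.
comma_then_space=$(printf 'a, b,c d\n' | sed 's/,[[:space:]]/_/g')
echo "$comma_then_space"
```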
	B. Man pages
	    For more info on perl regular expressions (perlre) and the
	    perl-style escapes mentioned above, see the following man pages:
	perlunicode - For details about Unicode and for details on "\pP",
	              "\PP", and "\X" (e.g., "\x{85}", "\x{2028}",
	              "\x{2029}").
	perluniintro - Unicode in general.
	perllocale - Localization, which affects, for example, the
	             list of alphabetic characters matched by "\w".

2. Eric Pement's "One-Liners For sed"

	The following sed document (Pement 2004) contains some useful sed
	one-liners. See the local copy in #sed1line.txt or the web URL at
	http://www.student.northpark.edu/pemente/sed/sed1line.txt

3. Multiple expressions

Sed can take multiple expressions and apply them, in order, to each line of its input stream. This is useful, for example, when removing text from the beginning and end of lines in the input stream. Consider an input file foo.txt, with the following content:

	SOL This is line 1 EOL
	SOL This is line 2 EOL

One way to remove the SOL and EOL symbols is to pass the contents of the file through an invocation of sed with 2 expressions separated by a semicolon, one for removing the SOL symbol and the other for removing the EOL symbol. Here "\W" (a GNU sed extension) matches the non-word character, a space, next to each symbol. The following sed invocation does exactly that:

	bash $ cat foo.txt | sed 's@^SOL\W@@;s@\WEOL$@@'  
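
The same idea can be sketched without relying on GNU extensions. The variant below is one possible alternative, not the only one: it passes each expression with its own -e option and uses [[:space:]] instead of "\W", on the assumption that the symbols are separated from the text by whitespace. The sample lines are piped in with printf so that the sketch does not depend on foo.txt existing.

```shell
# Assumption: the SOL/EOL markers are separated from the text by one
# whitespace character, so [[:space:]] can stand in for GNU's \W.

# Each -e adds one expression; sed applies them in order to every line.
result=$(printf 'SOL This is line 1 EOL\nSOL This is line 2 EOL\n' |
    sed -e 's@^SOL[[:space:]]@@' -e 's@[[:space:]]EOL$@@')
echo "$result"
```

Using "@" as the delimiter, as in the invocation above, is only a matter of style here; any character that does not occur in the pattern works.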
