Troubleshooters.Com
Presents
Linux Productivity Magazine
Volume 1 Issue 5, December 2002
VI and Vim
Copyright (C) 2002 by Steve Litt. All rights reserved.
Materials from guest authors copyrighted by them and licensed for perpetual
use to Linux Productivity Magazine. All rights reserved to the copyright
holder, except for items specifically marked otherwise (certain free software
source code, GNU/GPL, etc.). All material herein provided "As-Is". User
assumes all risk and responsibility for any outcome.
In pursuit of the dubious goal of producing
idiot-proof, zero-learning-curve programs, even programs intended for heavy-duty
use such as editors--arguably the most important piece of software you'll
use--have been turned into children's toys, effectively expert-proofed.
-- Tom Christiansen
(discussing alternatives to VI in his article, "Zenclavier: Extreme Keyboarding")
Editor's Desk
By Steve Litt
I hated VI the first time I used it. So to avoid using VI, I created a script
to ftp a file from the HP9000 down to my Win98 box, edit it with a Windows editor,
and ftp it back up. VI's arcane and non-intuitive keystrokes, plus its silly two-mode construction, were productivity killers.
A couple years later I started using Linux, and naturally chose a different
editor -- the WordStar-like Joe. That might have continued forever if Dillon
Jones hadn't bragged about Vim, the VI workalike common on Linux boxes.
Dillon was a fellow LUG member, and every chance he got he showed off a new
script, function or trick in Vim. He had a script to convert Vim to an HTML
authoring tool. With a few keystrokes he could convert a directory listing
into a data file or a computer program. I decided to reinvestigate VI.
And fell in love. I've used all sorts of editors, but VI is by far the fastest
and most powerful. And this fact is one of the world's best kept secrets.
All too many people draw the conclusion I originally drew -- that VI was
some silly kludge hacked together by 1970's spaghetti programmers.
It's an easy conclusion to draw. Who would guess to press "l" to go right?
Shouldn't "l" go left? What in the world is the rationale for using "h" to
go left? One would think "h" would mean "higher" or some such thing. "j"
and "k" for down and up respectively is just plain weird. And how strange
is the two-mode system, where the insert mode is for typing, and the command
mode is for editing, necessitating constant switching between them. Many
VI implementations cannot even support a mouse, nor can they highlight
text with the Shift+Arrow hotkey, nor skip words with Ctrl+Arrow. As a modern
software developer I've been trained to make my apps intuitive.
But anyone who has been around the block a few times knows there's a tradeoff
between beginner's intuitiveness and master's productivity. VI has been optimized
entirely for the touch typist. The l, h, j and k keys are easily reachable from
the touch-typing home position. Cursor (arrow) keys are not. In the time you can move
your hand from the keyboard to the mouse, a VI master can cut the current
paragraph and move it above the preceding paragraph (keystrokes {d}{P). VI is built exclusively for speed. Well, almost exclusively.
VI is built for power too. By combining VI's range enabled global commands,
range enabled search and replace commands, regular expressions, macros and
scripts, you can effect almost any conceivable transformation on a
file. If the transformation can be described in English, it can be
done with VI. And in VI it usually can be done in 1/10 the time required
to write a program to perform the same transformation.
Speed and power. Life can't get any better.
OH YES IT CAN! You can also have ubiquity. Every system with UNIX, a UNIX
variant or a UNIX workalike operating system has VI installed. Once you're
good at VI, you can walk into any UNIX installation and get to work, without
the need to learn a new editor or install your favorite editor. And once
you're really good at VI, it will probably become your favorite editor.
Speed, Power and Ubiquity. This issue of Linux Productivity Magazine details
everything you need to know to achieve top notch speed and power with this
ubiquitous editor. Also discussed is the ultra-powerful Vim superset of VI.
Vim has a programming language all its own, including functions. Vim is so
powerful that a professional grade outline processor, called VimOutliner,
has been created using a few Perl scripts and some Vim scripts. One article
of this magazine is devoted to VimOutliner.
So kick back, relax, and learn how to make VI, Vim and VimOutliner work their
magic. And remember, if you use Open Source or free software, this is your
magazine.
The Accuracy of This Document
By Steve Litt
The biggest problem in writing this magazine is the fact that there are many
different implementations of VI, and 98% of my VI use has been with the Vim
implementation (HP9000 UNIX VI and Elvis make up the other 2%). So I'm sure
some of the features I attribute to VI are really Vim specific.
I had a choice to make with my limited time. The magazine could describe
a tiny subset of VI functionality and rigorously check various VI versions,
or it could document a much larger functionality set while I tried my best
to remember whether other VI implementations support the various features.
I chose the latter. First, Vim is probably the most used VI implementation.
And second, most readers can forgive a few described features not available
in their VI implementation in order to learn VI's powerhouse techniques.
So if your VI doesn't support a particular feature (the \| alternative in
regular expressions, for instance), try to find another way to do it (macros,
for instance).
GNU/Linux
By Steve Litt
GNU/Linux is comprised of the Linux kernel originally crafted by Linus
Torvalds, plus many, many utilities, a large number of which were utilities
from the original GNU project. "GNU/Linux" is probably the most accurate
moniker one can give to the operating system. Please be aware that in all
of Troubleshooters.Com, when I say "Linux" I really mean "GNU/Linux".
I completely believe that without the GNU project, without the GNU Manifesto
and the GNU/GPL license it spawned, the operating system the press calls
"Linux" never would have happened.
I'm part of the press and there are times when it's easier to say "Linux"
than explain to certain audiences that "GNU/Linux" is the same as what
the press calls "Linux". So I abbreviate. Additionally, I abbreviate in
the same way one might abbreviate the name of a multi-partner law firm.
But make no mistake about it. In any article in Linux Productivity
Magazine, in the whole of Troubleshooters.Com, and even in the technical
books I write, when I say "Linux", I mean "GNU/Linux".
There are those who think FSF is making too big a deal of this. Nothing
could be farther from the truth. The GNU General Public License is the only
reason we can enjoy this wonderful alternative to proprietary operating systems,
and competition from Free Software is the only reason proprietary operating
systems aren't even more flaky than they are now. Last but not least, it's
significant to note that in Infoworld's October 6, 2000 article entitled
"E-business innovators", author Mark Leon named GNU's Richard Stallman as
the innovator associated with GNU/Linux.
Thanks
By Steve Litt
We all stand on the shoulders of those who came before us, and this Linux
Productivity Magazine is a perfect example. Certainly Ken Thompson deserves
thanks for creating UNIX, the original platform on which VI originated. Had
Richard Stallman not created the GNU Manifesto and GNU GPL license, GNU/Linux
wouldn't exist and I'd still be using PFE shareware on Windows. For the same
reason, Linus Torvalds must be thanked.
Thanks to Bram Moolenaar for creating the Vim implementation of VI. Vim is
standard on all Red Hat and Mandrake systems, and probably many others.
Vim is one of VI's greatest ambassadors, running on Linux, Windows, Mac,
OS/2, UNIX, and even the Amiga. And Vim's feature set is so great that, frankly,
I'm spoiled.
Thanks to the entire Open Source community for making such great software.
And a big thank you to my fellow VimOutliner developers, Noel Henson and Matej
Cepl, for taking VimOutliner from an "army surplus" app to a full featured
outline processor.
VI Life Preserver
By Steve Litt
This article is designed for the person who has never
before used VI. Specifically, the newbie using this article won't get caught
in a situation where he can't get out of the program. And he'll be able to
open and save files, maneuver around the file, and perform basic edits. The
person familiar with this article can go to any VI equipped computer and edit
files, although such editing won't be particularly efficient. Subsequent
articles bestow efficiency.
Getting In and Out of VI
Getting into VI is easy. At the command prompt, just type vi and press Enter. Or, if you want to use VI to edit a specific file called myfile, at the command prompt type vi myfile and press the Enter key. Once in VI, the following commands are available to retrieve files, save files, get help and get out:
| Esc | Gets you out of insert mode. Commands must be performed from command mode, not from insert mode. Pressing Esc while in command mode does nothing and is harmless. Be sure you're in command mode before trying the following commands. When in command mode the cursor is fat and covers one character. When in insert mode the cursor is skinny and is between two characters. |
| :q! | Gets you out of VI no matter what. No files are saved. |
| :wq | Saves the file and quits VI. |
| :w | Saves the file and keeps you in the VI editor. |
| :q | Quits VI if the file is up to date. If there are unsaved changes you are kept in VI. |
| :e myfile | Places file myfile in the VI editor. |
| :h | Goes into help mode. You get out of help mode with the :q command. |
| :h myfeature | Goes into help mode and looks up any help on feature myfeature. If there's no help on myfeature, it errors out and doesn't go into help mode. |
The preceding commands are all you need to get into VI, get back out, and edit a specific file.
Moving Through the File
Before you can do any real editing, you must be able to move through the
file. The following is a small subset of the commands with which you can
move through the file. Remember once again you must be in command mode to
run these commands. You can move from insert mode to command mode with the
Esc key.
| l (lower case L) | Cursor one character right. |
| h | Cursor one character left. |
| j | Cursor one line down, staying in the current column. In other words, if you were on the 14th character of line 5 when you pressed j, the cursor would end up on the 14th character of line 6. If line 6 has less than 14 characters, the cursor will be on the last character in line 6. |
| k | Cursor up one line. The k command works the same as j only up instead of down. |
| w | Cursor one word right. This command wraps at the end of a line. |
| b | Cursor one word left. This command wraps at the beginning of a line. |
| G | Cursor to end of file. |
| gg | Cursor to start of file. On VI implementations without the gg command, use 1G. |
| 100j | Cursor down 100 lines. Naturally, that number can be any number you choose. You can also precede k with a number to move up that many lines. This same technique can be used to move several characters (100l or 100h), or several words (100w or 100b). |
| Ctrl+G | Tells you your current line number, so you can estimate how far you need to jump or move. |
| 52G | Cursors to line 52 from anywhere in the file. Naturally, that number can be any desired destination line. Note that on some implementations you can do this as 52gg, which might be easier for some keyboarders. |
| Ctrl+F | Move down (forward) one screenful of lines. |
| Ctrl+B | Move up (backward) one screenful of lines. |
| Ctrl+D | Move down 1/2 screenful of lines. |
| Ctrl+U | Move up 1/2 screenful of lines. |
The preceding is a list of basic, essential moves. There are many more. In Vim, use the :h motion.txt command to learn the rich variety of motion commands. And there are moves based on searches.
The next section of this article covers basic searches.
Basic Searches
This section covers VERY basic searches. VI searches can be incredibly powerful and complex, because they incorporate regular expressions.
This article does not cover search and replace, because in VI search and
replace is extremely powerful, and is implemented differently from plain
searches.
| :set ic | Sets VI to ignore case in all searches and replaces. All subsequent searches are case insensitive. |
| :set noic | Sets VI to consider case in all searches and replaces. All subsequent searches are case sensitive. |
| /Steve | Search for the next instance of Steve in the file. In other words, search down from the cursor. |
| ?Steve | Search for the previous instance of Steve in the file. In other words, search up from the cursor. |
| /web *master | Search for webmaster, web master, or web and master separated by any number of spaces. This is a very basic regular expression that means "look for web, followed by zero or more spaces, followed by master". Such regular expressions also can be used with the ? to search backwards. The asterisk means "zero or more of the preceding character", and the preceding character in this case is a space. |
| / | Search down for the next occurrence of whatever you searched for last. |
| ? | Search up for the previous occurrence of whatever you searched for last. |
| n | Repeat the previous search, in the direction of the previous search. |
| N | Repeat the previous search, in the opposite direction of the previous search. |
As an added benefit, many VI implementations including the Vim VI implementation
give you a search history. Press the slash key (/) followed by the
up arrow, and you'll see the history of all your searches, both forward and
backward. If you want to search backwards for something in your history,
press the question mark (?) and then the up arrow. If you go too
far with the up arrow, you can return with the down arrow. If you're familiar
with the command history provided by the bash shell, it's very similar.
This section has provided you with a minimal set of search commands with which you can maneuver the file.
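If you think more easily in program code, here is a minimal Python sketch of the web *master search described above. It is not VI itself; it assumes only that Python's re module treats a space followed by an asterisk the same way VI does, and the sample strings are made up for illustration.

```python
import re

# The same pattern you would type after the slash in VI.
pattern = re.compile(r"web *master")

# Hypothetical sample text, one candidate per string.
samples = ["webmaster", "web master", "web   master", "web-master"]
matches = [s for s in samples if pattern.search(s)]

# A rough stand-in for :set ic, which makes searches ignore case.
ignore_case = re.compile(r"web *master", re.IGNORECASE)
```

The hyphenated sample is skipped because the asterisk applies only to the space before it. The case-sensitive pattern also skips "Web Master", while the ignore_case version finds it.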
Inserting Text
There are probably 20 commands to insert text, and the truly efficient VI
user knows and regularly uses most of them. However, only a few are necessary.
Strictly speaking, you could do everything with the i command (Insert
before the current character). You could navigate to the character before
which you want to insert, and then press the i key. Use of only the i key would preclude easily inserting at the end of a line. What you'd need to do is navigate to the end of the line, press the i
key, insert the character previously at the end of the line, insert text
you want to insert, return from insert mode, and then delete the extra end
character. Ugh!
That's why VI has the insert after (a) command, which inserts after the current character. So you'd maneuver to the end of the line, press the a key, and type what you want at the end of the line.
Here is a summary of enough insert commands to make your life easy:
| Esc | Return from insert mode to command mode. |
| i | Insert before the current character. Observe that this gives you no convenient way to insert at the end of the line. |
| a | Insert after the current character. Observe that this gives you no convenient way to insert at the beginning of the line. |
| I | Insert at the beginning of the line. |
| A | Insert at the end of the line. This is needed often and is a real time saver. |
| o | Open a brand new empty line below the current one, and position the cursor at the start of that line, in insert mode. |
| O | Open a brand new empty line above the current one, and position the cursor at the start of that line, in insert mode. |
The preceding set of commands is by no means exhaustive, but it's enough to work with some efficiency in VI.
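The difference between i, a and A is easy to see if you treat a line as a string and the cursor as an index into it. The following is a small Python sketch of that idea, not anything VI runs. The function names are my own, and the cursor column is assumed to be a 0-based index of the character the cursor sits on.

```python
def insert_before(line, col, text):
    """Model of i: new text lands before the character under the cursor."""
    return line[:col] + text + line[col:]


def insert_after(line, col, text):
    """Model of a: new text lands after the character under the cursor."""
    return line[:col + 1] + text + line[col + 1:]


def append_to_line(line, text):
    """Model of A: new text lands at the end of the line, wherever the cursor is."""
    return line + text


line = "hello"
last_col = len(line) - 1  # the cursor can sit no further right than this
```

With the cursor on the final "o", insert_before gives "hell!o", which is why i alone can't add to the end of a line, while insert_after and append_to_line both give "hello!".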
Deleting Text
VI gives you two ways to delete -- You Asked For It,
You Got It (YAFIYGI) and What You See Is What You Get (WYSIWYG). Each has
its benefits. In general, go YAFIYGI for small deletions, and WYSIWYG for
large or complex ones. Obviously, WYSIWYG is safer, and YAFIYGI is often
faster. I use both regularly. Remember also that almost all VI implementations
have at least one level of undo, so it usually isn't the end of the world
if you blow a delete.
You Asked For It, You Got It
Here are some of the most used YAFIYGI commands:
| u | Undo the previous command. Very handy to correct a blown delete. Many VI implementations have multiple levels of undo. Some such implementations use a succession of u commands to do more undos, and Ctrl+R commands for redos in case you undo too far. Other implementations use u for the first undo, and Ctrl+R commands for subsequent undos. The latter, although bizarre, is the "true VI way". Your best bet is to be alert enough that you never need more than a single level of undo. Note that if you save often, you can always revert to the last saved version. |
| :e! | Revert to the last saved version of the current file. |
| x | Delete the character under the cursor. The cursor comes to rest on the character following the deleted one, or if the deleted character was the final character on the line, the character before the deleted one. |
| X | Delete the character to the left of the cursor. The cursor stays on the current character. |
| dd | Delete the current line so that the cursor is placed on the line following the one deleted. |
| D | Delete current character and everything to the end of the current line. |
| d^ | Delete everything to the left of the cursor, back to the first non-blank character on the line. (Use d0 to delete all the way back to the first column.) |
| dG | Delete the current line and everything below. (Careful!) |
| dgg | Delete the current line and everything above. (Careful!) |
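To see exactly what each of these deletes takes away, here is a rough Python model. It is only a sketch: a line is assumed to be a string, the file a list of lines, and the cursor position a 0-based row and column. The function names are invented for this illustration.

```python
def delete_under_cursor(line, col):
    """Model of x: remove the character under the cursor."""
    return line[:col] + line[col + 1:]


def delete_left_of_cursor(line, col):
    """Model of X: remove the character left of the cursor, if there is one."""
    return line[:max(col - 1, 0)] + line[col:]


def delete_to_end_of_line(line, col):
    """Model of D: remove the cursor character and everything after it."""
    return line[:col]


def delete_to_end_of_file(lines, row):
    """Model of dG: remove the current line and everything below it."""
    return lines[:row]


def delete_to_start_of_file(lines, row):
    """Model of dgg: remove the current line and everything above it."""
    return lines[row + 1:]
```

Notice how little survives dG and dgg: only the lines on the far side of the cursor. That's the reason for the "Careful!" in the table.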
What You See Is What You Get
WYSIWYG deletes are safer because you can review what you'll do before you
do it. Here are the commands to perform WYSIWYG deletion:
| V | Begin highlighting lines of text. Use j, k, or other move commands discussed previously to extend the highlighting. |
| d | Delete the highlighted text. |
Summary
VI is an incredibly rich and powerful editor. There's no way to learn the
whole thing at once. But by mastering how to start VI, how to quit it, how
to load, save and quit files, and a few commands to move around, insert text
and delete text, you'll be able to get into the game. By mastering the contents
of this article, you'll be able to walk into any UNIX shop and use the editor
without looking like a total newbie.
That being said, VI offers MUCH more. Its power is almost magical. Read on...
Search and Replace: Substitute and Global Commands
By Steve Litt
This article reviews the mechanics of the search and replace commands: Substitute
commands, Global commands, and combinations of the two. To keep it simple,
this article contains no regular expressions, so naturally all its examples
are academic rather than real-world. An article later in this magazine details
regular expressions, which, when combined with the information in this article,
gives you complete power over a file.
The simplest search and replace occurs on a single line. For instance, to
replace the first instance of "Windows" with "Linux", use the following command:
:s/Windows/Linux/
The colon puts you in EX mode, which is what you use for search and replace, as well as global commands and many other commands.
|
NOTE
In some VI implementations, the EX mode has a history feature similar to the bash shell. After pressing
the colon character, you can use the up arrow to recall previous commands,
and the down arrow to go back to later commands if you go too far back.
|
The s stands for "substitute", because that's what you're doing. The first
forward slash signifies the beginning of the search expression. The second
forward slash signifies the end of the search expression and the beginning of
the replace expression. The third forward slash signifies the end of the replace
expression.
|
NOTE
In
many VI implementations you don't need to use the slash character as the
expression delimiter. You can use most non-alphanumeric characters (but not
\, " or |). This is very handy when working with UNIX filenames, as in the
following example:
:s+/usr/local/+/opt/+
Whatever character follows the :s is defined to be the delimiter character.
If your implementation doesn't support this, you can represent slashes in
search and replace expressions by escaping them with backslashes, as follows:
:s/\/usr\/local\//\/opt\//
As you can see, the escaping method is much less readable, so if you can use alternative delimiter characters, it's a good idea.
|
Recalling the substitute command:
:s/Windows/Linux/
The preceding simple command replaces only the first instance on the line.
Sometimes that's what you want, and sometimes it isn't. Often you want to
replace all occurrences on the line. In that case, append the letter g (stands
for Global) after the close of the replace expression, as follows:
:s/Windows/Linux/g
Perhaps you're replacing ten or so instances, and you want to make sure you
really want to replace each one. In most VI implementations you can use the
letter c (stands for Confirm) at the end, in which case before each replacement
you'll be asked yes or no:
:s/Windows/Linux/gc
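For comparison, here is how the three forms of the command behave when mimicked in Python. This is a sketch only, using Python's re.sub rather than VI. The sample line is made up, and the answers list is a hypothetical stand-in for the yes or no you would type at each confirm prompt.

```python
import re

line = "Windows here, Windows there"

# :s/Windows/Linux/ changes only the first instance on the line.
first_only = re.sub("Windows", "Linux", line, count=1)

# :s/Windows/Linux/g changes every instance on the line.
every_one = re.sub("Windows", "Linux", line)


def substitute_with_confirm(text, old, new, answers):
    """Rough model of the c flag: one True/False answer per match, in order."""
    remaining = iter(answers)

    def decide(match):
        # Running out of answers is treated as "no" in this sketch.
        return new if next(remaining, False) else match.group(0)

    return re.sub(re.escape(old), decide, text)


# Answer "no" to the first instance and "yes" to the second.
confirmed = substitute_with_confirm(line, "Windows", "Linux", [False, True])
```

Here first_only keeps the second "Windows", every_one keeps none, and confirmed keeps only the one you said no to.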
Ranges
So far we've discussed search and replace on a single line, which is pretty
useless because on a single line it might be faster to search and replace
manually. The real power comes when you search and replace over a range
of lines. You declare the range before the s command. By far the most common
range is "all lines in the file", which is represented by the percent sign,
as follows:
:%s/Windows/Linux/gc
Due to the percent sign before the s, the preceding replaces every instance
of "Windows" with "Linux" throughout the entire file. Please remember, the
g on the end means "every instance on the line", and the percent sign at
the beginning means "every line in the file", so combined they mean "every
occurrence on every line", which works out to "every occurrence in the file".
Either the % or the g can be used without the other, or they can both be used
together.
You can use other ranges, as described in the following table:
| Range | Example | Explanation |
| 4,7 | :4,7s/Windows/Linux/ | Runs the substitution command on lines 4 through 7 inclusive. |
| 4;7 | :4;7s/Windows/Linux/ | Also runs the substitution command on lines 4 through 7 inclusive. The semicolon differs from the comma only in how the second value is calculated: after a semicolon, the current line is set to the first value before the second is evaluated. With an absolute line number like 7 that makes no difference, but a relative second value is calculated from the first, so 4;+7 means line 4 and the 7 lines following it. |
| 4 | :4s/Windows/Linux/ | Runs the substitution command on line 4 only. |
| 4,$ | :4,$s/Windows/Linux/ | Runs the substitution command on lines 4 through the file's last line, inclusive. In line ranges, the $ character represents the last line in the file. |
| 4,. | :4,.s/Windows/Linux/ | Runs the substitution command on lines 4 through the line the cursor is on, inclusive. In line ranges, the . character represents the current line. |
| .,2000 | :.,2000s/Windows/Linux/ | Runs the substitution command on the current line through line 2000 inclusive. This shows that the dot can be used as the beginning, or as the end of a range. |
| .,.+3 | :.,.+3s/Windows/Linux/ | Runs the substitution command on the current line through the line three below the current line inclusive. In other words, this line plus the next three. In general, you add and subtract from a range value by appending a plus and number, or a minus and number. Be careful not to go too far or too short by one -- it's easy to do. |
| .,+3 | :.,+3s/Windows/Linux/ | CAREFUL!!! This produces the identical result to the preceding example, even though we removed the dot in the range end expression. In a + or - expression, if what it's relative to is not explicitly declared, it defaults to "current line". This can lead to confusion. Use the version with the dot. |
| /My_Opinion/,$ | :/My_Opinion/,$s/Windows/Linux/ | Search forward for the next occurrence of My_Opinion, and run the substitution on that line through the end of file, inclusive. Note that the search will not find matches on the current line. |
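The rules in the preceding table can be written down as a small Python sketch that turns a range into its first and last line numbers. This is my own simplified model, not VI's actual parser. It handles only numbers, the dot, the dollar sign, the percent sign and plus or minus offsets, and it leaves out searches entirely. The names resolve_address and resolve_range are invented for the illustration.

```python
import re

# One address: an optional base (number, dot or dollar) plus optional offsets.
ADDRESS = re.compile(r"(\d+|\.|\$)?((?:[+-]\d*)*)")


def resolve_address(text, current, last):
    """Turn an address such as '4', '.', '$', '.+3' or '+3' into a line number."""
    match = ADDRESS.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported address: {text!r}")
    base, offsets = match.groups()
    if base is None or base == ".":
        line = current  # a missing base defaults to the current line
    elif base == "$":
        line = last
    else:
        line = int(base)
    for sign, digits in re.findall(r"([+-])(\d*)", offsets):
        amount = int(digits) if digits else 1
        line += amount if sign == "+" else -amount
    return line


def resolve_range(text, current, last):
    """Return the (first, last) line numbers of a range, 1-based and inclusive."""
    if text == "%":
        return 1, last
    for separator in (";", ","):
        if separator in text:
            left, right = text.split(separator, 1)
            first = resolve_address(left, current, last)
            # After a semicolon, the second address is relative to the first.
            base = first if separator == ";" else current
            return first, resolve_address(right, base, last)
    only = resolve_address(text, current, last)
    return only, only
```

With the cursor on line 10 of a 100 line file, resolve_range(".,+3", 10, 100) and resolve_range(".,.+3", 10, 100) both come out as (10, 13). The comma and semicolon part ways only with a relative second value: "4,+7" gives (4, 17) because +7 counts from the cursor, while "4;+7" gives (4, 11) because +7 counts from line 4.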
Searches in Ranges
The final example in the preceding table shows a search in a range. This
makes VI incredibly powerful, and if done casually incredibly dangerous.
But here let's discuss the power. A search can appear in either the range
start expression, the range end expression, or both. The search can be forward
or backward. The line found by the search can be added to or subtracted from.
|
WARNING
If you have wrapscan set, searches in ranges can do some strange things.
A forward search can fail to find the string by end of file, wrap back up,
and find one above where you are. Likewise, a backward search can wrap around
the top and find a match below you. These typically are not what you want.
To make your results more predictable, before running commands with searches
in ranges, turn off wrapscan with the following command:
:set nowrapscan
Later, if you want to restore wrapscan so you can find a pattern anywhere in the file, restore it with this command:
:set wrapscan
|
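To make the wrapscan warning concrete, here is a minimal Python sketch of a forward search with and without wrapping. It is a simplified model under my own assumptions: plain substring matching instead of regular expressions, 1-based line numbers, and a made-up four line file.

```python
def search_forward(lines, current, word, wrapscan):
    """Return the 1-based number of the first line after `current` that
    contains word, or None if there is no such line."""
    candidates = list(range(current + 1, len(lines) + 1))
    if wrapscan:
        # Wrap around the end of file and keep looking from line 1.
        candidates += list(range(1, current + 1))
    for number in candidates:
        if word in lines[number - 1]:
            return number
    return None


# Hypothetical file: the only My_Opinion is ABOVE the cursor on line 2.
lines = ["My_Opinion: VI rules", "some text", "more text", "the end"]
with_wrap = search_forward(lines, 2, "My_Opinion", wrapscan=True)
without_wrap = search_forward(lines, 2, "My_Opinion", wrapscan=False)
```

With wrapping, the "forward" search lands on line 1, above where you started, so a range built on it would cover lines you never meant to touch. Without wrapping, the search simply fails, which is usually the safer outcome.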
Remembering to set nowrapscan, see the following examples:
| Range | Function | Explanation |
| :/My_Opinion/,$s/Windows/Linux/ | Search forward | Search forward from the current line to the next line containing "My_Opinion", and from that line to the end of file substitute "Linux" for "Windows". The search WILL NOT match any "My_Opinion" on the current line, and instead will match the next downward occurrence of "My_Opinion". |
| :?My_Opinion?,$s/Windows/Linux/ | Search backward | Search backward from the current line to the next line up containing "My_Opinion", and from that line to the end of file substitute "Linux" for "Windows". The search WILL NOT match any "My_Opinion" on the current line, and instead will match the next upward occurrence of "My_Opinion". |
| :1+/My_Opinion/,$s/Windows/Linux/ | Search forward from line 1 | The previous searches started from the current line, which may or may not be what you want. If starting the command with a known state is important to you, use this command form to start from line 1. Note that this will not find a match on line 1, so if that's an issue you'll need to run an additional command to get the lines you missed. |
| :2+?My_Opinion?;/My_Opinion/-1s/Windows/Linux/ | See comment | If the first line contains "My_Opinion", the preceding command did not substitute on that line or any lines until the next occurrence of "My_Opinion". You probably want those lines substituted. This command does that. Starting at line 2, it goes backward (you did set nowrapscan to prevent search wrapping, didn't you?), and if it finds "My_Opinion" on line 1, it alters line 1 and every line until, and including, the line before the next "My_Opinion", which presumably was fixed by the preceding command. |
WYSIWYG Ranges
Consider the preceding range substitution:
:2+?My_Opinion?;/My_Opinion/-1s/Windows/Linux/
Can you glance at that command and tell what it does? Not unless you're a true VimMeister. Personally, I'm not that good.
The classic early 1980's essay titled "Real Programmers Don't Use Pascal" discusses
the wonders of "You asked for it, you got it" (YAFIYGI) editors like DEC's
TECO, in which a single wrong character can destroy the file. The preceding
command might involve just a little too much "you asked for it, you got it" risk for
someone on a tight deadline.
When you need to play it safe, many VI implementations have something called
Visual Mode, in which you can highlight a certain range of lines, and run
a simple substitution on that range. For instance, in the Vim implementation,
you'd go to the first line you want to change, press Shift+V, then use motion
commands, such as j or a search, to move to the end of the range. Once the
desired range is highlighted, you simply press the colon key, and then type
s/Windows/Linux/ to make the change in the highlighted area.
Generally speaking, when you're making range changes in realtime, if you
have Visual Mode available, you'll use it to simplify ranges.
And of course, when you want to show off at your LUG, you use the YAFIYGI
method. After all, real men use TECO when they can, and if TECO isn't available,
they use VI recklessly.
Global Commands
Real men and quiche-eating Pascal programmers alike love VI's Global commands.
The real man can destroy an entire file with this 8 character command:
:g/the/d
Or he can hide his tracks from quiche-eating Pascal Programmers by combining
it with an obfuscatory substitute command like this:
:1+/friend/;/lover/g/Madonna/+2s/singer/singer,actress/
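If you're a Pascal Programmer (or the modern equivalent, a Java Jockey) interested in knowing what the preceding command does, read it in three pieces: the range runs from a line containing "friend" through the next line containing "lover"; the g/Madonna/ picks out the lines in that range containing "Madonna"; and the +2s replaces "singer" with "singer,actress" on the line two below each of them. To real men the preceding is intuitively obvious to the most casual observer.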
Quiche eaters also love Global commands. When the boss expects the huge production
produced by quiche-eating rapid development techniques, your editing needs
to be lightning quick. Consider how often you need to pick out only lines
containing "whatever" from a huge file. Edit a duplicate copy of the file
with VI, and run the following command:
:g!/whatever/d
The preceding command deletes every line not containing "whatever". But what
if you need only the lines containing "whatever" plus the lines containing "whichever"? Watch this:
:g/whatever/.w>>w.txt
:g/whichever/.w>>w.txt
Try it. Create a data file with lines consisting of either "whatever", "whichever"
or "whyever". Then, so you can be assured that all lines are copied in the
order they're found, prepend line numbers to every line with the following
command:
:%!cat -n
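If you want to predict the result before trying it, the following Python sketch mimics the commands above on a small numbered data file. It is only a model: the list named appended is a stand-in for w.txt, the sample lines are made up, and plain substring tests stand in for VI's pattern matching.

```python
# Hypothetical data file, already numbered the way :%!cat -n would number it.
lines = [
    "1 whatever",
    "2 whichever",
    "3 whyever",
    "4 whatever",
    "5 whichever",
]

# :g!/whatever/d keeps only the lines containing "whatever".
kept = [line for line in lines if "whatever" in line]

# :g/whatever/.w>>w.txt followed by :g/whichever/.w>>w.txt
appended = []  # stand-in for w.txt
appended += [line for line in lines if "whatever" in line]
appended += [line for line in lines if "whichever" in line]
```

The line numbers make the order plain: each command appends its matches in the order found, so w.txt ends up with lines 1 and 4 from the first command, then lines 2 and 5 from the second.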
Global commands are also an excellent way to perform a substitution on only
a certain class of lines. The following changes the word "judgment" to "ripoff"
on any line containing "microsoft antitrust":
:g/microsoft antitrust/s/judgment/ripoff/
The cool thing about the preceding is the word "judgment" can appear anywhere
in relation to the word "microsoft antitrust", and yet the deed will be done.
Perhaps you want to make the change anywhere near the word "microsoft antitrust",
where "anywhere near" is defined as the line, or anything within 2 lines
before it or 2 lines after it. Watch this:
:g/microsoft antitrust/-2,+2s/judgment/ripoff/c
You notice the c (for confirm) at the end? Any time you do anything as adventurous
as the preceding, you want to check your work. It would be terrible to write
an article and inadvertently change "I used my best judgment" to "I used
my best ripoff".
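Here is a rough Python model of what that "anywhere near" command touches. It is a sketch under several assumptions of mine: substring matching instead of patterns, no confirm prompt, and a neighborhood that is quietly clipped at the top and bottom of the file, where a real VI may complain about an invalid range instead. The function name substitute_near is invented.

```python
def substitute_near(lines, marker, old, new, before=2, after=2):
    """For each line containing marker, replace the first old with new on
    that line and on the lines within `before` above and `after` below it."""
    result = list(lines)
    # Find the marker lines first, then run the substitute around each one.
    marked = [i for i, line in enumerate(lines) if marker in line]
    for i in marked:
        start = max(i - before, 0)
        stop = min(i + after, len(result) - 1)
        for j in range(start, stop + 1):
            result[j] = result[j].replace(old, new, 1)
    return result


# Hypothetical article text, one sentence per line.
article = [
    "a judgment",
    "b",
    "microsoft antitrust judgment",
    "c",
    "d judgment",
    "e",
    "I used my best judgment",
]
changed = substitute_near(article, "microsoft antitrust", "judgment", "ripoff")
```

The last line is more than two lines from the marker, so "I used my best judgment" survives. Move it two lines closer and it would not, which is exactly why you want the c on the end.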
The Anatomy of a Global Command
The following shows how a global command is built. The first form runs the command
on every line within the range that matches the pattern. The second form runs
the command on every line within the range that DOESN'T match the pattern:
:[range]g/{pattern}/[command]
:[range]g!/{pattern}/[command]
If the range is not given (and it usually is not), the default range is every
line in the file. The concept is simple enough. The command is run on every
matching (or non-matching with the exclamation point form) line. Where it
gets more complex is when the command gets complex.
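The anatomy above can be sketched as one short Python function. This is my own model, not how VI is implemented. The command is assumed to be a Python function that takes a line and returns its replacement, or None to delete the line, and Python's re.search stands in for VI's pattern matching.

```python
import re


def global_command(lines, pattern, command, invert=False, first=1, last=None):
    """Model of :[range]g/{pattern}/[command] and, with invert=True, the g! form.
    The range first..last is 1-based and inclusive, defaulting to the whole file."""
    last = len(lines) if last is None else last
    result = []
    for number, line in enumerate(lines, start=1):
        in_range = first <= number <= last
        matched = re.search(pattern, line) is not None
        if in_range and matched != invert:
            line = command(line)
            if line is None:  # the command deleted this line
                continue
        result.append(line)
    return result


def delete(line):
    """Stand-in for the d command."""
    return None


# Hypothetical four line file.
sample = ["# comment", "code", "more # trailing", "last"]
without_pound = global_command(sample, "#", delete)
only_pound = global_command(sample, "#", delete, invert=True)
from_line_two = global_command(sample, "#", delete, first=2)
```

The three calls correspond to g, g! and a g limited to lines 2 through the end of the file. In the last one, the pound sign on line 1 is outside the range and is left alone.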
Real Life Global Commands
Certainly the simplest command is the delete command:
:g/#/d
The preceding deletes every line with a pound sign. The following deletes every line that DOES NOT contain the pound sign:
:g!/#/d
The global command is often used to move certain classes of lines to a second file:
:g!/^\t\t/.w>>top2levels.otl
The preceding command appends to top2levels.otl all lines that don't start with 2 tabs. In
a tab-indented outline, that would be the top 2 levels. The caret at the
start of the pattern is a special character representing start of line, as will be
explained in the article on regular expressions.
Another common command used with the Global command is the substitute command.
In this case the substitute command substitutes based on a criterion totally
different from the criterion by which the line is chosen. Typically the pattern
by which the line is chosen can occur anywhere on the line in relation to
the text being substituted, so you cannot include the selection criterion in a
single substitute command. Even when the selection criterion is predictable,
the global command often simplifies it. Which would you rather do:
:g/^master_/s/slave/SLAVE/g
or
:%s/^\(master_.*\)slave/\1SLAVE/g
The latter command uses groups to "remember" the match, and places that remembered text in the replace text in the form of \1.
Even the somewhat convoluted second form is not completely accurate, because
if there are multiple instances of the word "slave", VI's native long matching
will cause it to only change the last instance of "slave" on each line. Long
and short matching are explained in the article on regular expressions.
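The difference between the two forms is easy to see in a small Python sketch. This assumes Python's re module, whose greedy matching behaves like VI's long matching here; the sample line is invented for the demonstration.

```python
import re

line = "master_a slave and another slave"

# Like :%s/^\(master_.*\)slave/\1SLAVE/g
# The greedy .* swallows the first "slave", so only the last one changes.
one_step = re.sub(r"^(master_.*)slave", r"\1SLAVE", line)

# Like :g/^master_/s/slave/SLAVE/g
# Select the line first, then replace every instance on it.
two_step = line.replace("slave", "SLAVE") if re.search(r"^master_", line) else line
```

The one step version leaves the first "slave" alone, while the select-then-substitute version changes both.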
Summary
Substitute and Global commands offer immense power, but that fact is not
obvious from this article. After all, most examples in this article simply
looked for one string and replaced it with another. Ranges added some power,
and so did the one-two punch of a Global command whose command upon match is a Substitute command.
But until you introduce Regular Expressions, the Global and Substitute commands
are just parlor tricks of a second rate magician. Regular Expressions transmogrify
these commands into wizardry. Read on...
Regular Expressions
By Steve Litt
VI is the most powerful editor I've seen. It absolutely
dwarfs the WordPerfect editor I loved for so many years. The person truly adept
at VI can accomplish file translations and conversions typically requiring
a computer program. Indeed, I've written Vim scripts that convert a VimOutliner
outline to an htmlslides presentation. Another script reads an issue of Troubleshooting
Professional Magazine or Linux Productivity Magazine and creates a table
of contents that can be pasted into the magazine. Yet another converts a
tab-indented outline into a LyX document. Yet another formats data pasted
from my sales database into mailing labels. These are very simple scripts
-- not a one is more than 20 lines.
Better yet, VI can be used on the fly to do work usually requiring programming.
I've taken data files and converted them to a program to produce them. I've
added, deleted and reordered fields in fixed length and comma delimited data
files.
Regular expressions are a fundamental component of this power. Simply put,
regular expressions are searchable expressions that match a class of strings
in the file. As a basic example, the following VI search command:
/web\s*master
matches "webmaster", or "web master", or "web    master" (with any amount of whitespace),
because \s represents a single character of any whitespace, and an asterisk
means "zero or more of the preceding character". Here are some other examples:
| /[aeiouAEIOU] | Find the next vowel. |
| /[^aeiouAEIOU[:space:][:punct:][:digit:]] | Find the next non-vowel letter. |
| :%s/Windows/Linux/g | Replace all instances of "Windows" with "Linux". |
| :%s/\([^,]*\),\([^,]*\),\(.*\)/\2,\1,\3/ | Exchange the first two fields in each row of a file of comma delimited records. |
| :s/\<\(.\)\([^[:space:][:punct:]]*\)\>/\u\1\2/g | Capitalize the first letter of every word on a line (title case converter). |
That's some pretty serious power for the person expert at creating regular
expressions. Entire books have been written about regular expressions. However,
this article will give you the 10% to do 90% of what you need to do in VI.
Character Sets
A character set is a wildcard. When compared to a character, there is a match
if the character is one of those in the character set. There are several
types of character sets, but the most basic is a list of matching characters
inside of brackets. For example, [aeiou] matches any lower case vowel. The list could be a range like [0-9], which matches any decimal digit. The brackets can contain a combination. For instance, [0-9.] matches any decimal digit or a decimal point (within brackets a dot is NOT a wildcard).
If the characters inside of the brackets start with a caret (^), the character set matches anything except the characters inside the brackets. So [^a-zA-Z]
matches any character except a letter in the alphabet. If you actually want
to match a caret, simply make it not the first character, as in [@^#], which matches either an at sign, a caret, or a pound sign.
|
IMPORTANT
A character set matches only a single character. It cannot match a phrase.
It is only when character sets are grouped with other character sets, other
characters, and enumerators, that they can match phrases.
|
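Bracketed character sets work the same way in most regular expression engines, so you can experiment outside of VI. The following is a small sketch using Python's re module; the sample string is made up, and only the bracket syntax (not the rest of VI's regex dialect) is assumed to carry over.

```python
import re

sample = "Pi is 3.14 @ home^#"

vowels = re.findall(r"[aeiou]", sample)          # any lower case vowel
numeric = re.findall(r"[0-9.]", sample)          # a digit or a literal dot
non_letters = re.findall(r"[^a-zA-Z]", sample)   # anything except a letter
symbols = re.findall(r"[@^#]", sample)           # at sign, caret, or pound sign

# Every hit is exactly one character long: a set never matches a phrase.
single_chars = all(len(hit) == 1 for hit in vowels + numeric + non_letters + symbols)
```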
Many character sets are so common that VI shortcuts are provided for them. Here are some character set shortcuts:
| Shortcut | Equivalent Character Set | What it matches |
| \d | [0-9] | Digit |
| \D | [^0-9] | Any character except a digit |
| \s | [ \t] | Whitespace (in Vim, a space or a tab) |
| \S | [^ \t] | Any character that is not whitespace |
| \w | [a-zA-Z0-9_] | "Word char": alphanumeric or underscore, not space or punctuation |
| \W | [^a-zA-Z0-9_] | "Word separator": space or punctuation, but not EOL or BOL |
| \l | [a-z] | Lowercase letter. In the search expression it matches any lower case letter. In the replace expression it lowercases the character or character representer following it. |
| \L | [^a-z] | In the search expression \L matches any character except a lowercase letter. In the replace expression it makes everything that follows it, to the end of the replace expression, lower case. However, a later \U in the replace expression ends its effect. |
| \u | [A-Z] | Uppercase letter. In the search expression it matches any upper case letter. In the replace expression it capitalizes the character following it. |
| \U | [^A-Z] | In the search expression \U matches any character except an uppercase letter. In the replace expression it makes everything that follows it, to the end of the replace expression, upper case. However, a later \L in the replace expression ends its effect. |
What do you do if you need to match a control character like a newline,
carriage return, escape character or tab? In the UNIX world you could construct
the unprintable by pressing Ctrl+V followed by the control character (Ctrl+M
for the carriage return character, for instance). But that's ugly. A much
better solution is to use the following character representers:
| Character Representer | What it represents |
| \e | The Escape character (decimal 27) |
| \t | The Tab character |
| \r | The carriage return character (looks like ^M in VI) |
| \b | The Backspace character |
| \n | The end of line character |
| . | Any character |
There are also special strings that represent positions in a line instead of characters:
| Positional Representer | Example | Meaning |
| ^ | :s/^/cp / | ^ means beginning of line. This example "replaces" the beginning of line with cp followed by a space, creating a copy command from a directory listing. Note that when you "replace" a positional representer, there's nothing to replace, so you simply put the replacement text in the position represented by the positional representer. |
| $ | :s/$/ backupdir/ | $ means end of line. This example puts a space followed by backupdir at the end of the line, finishing the copy command you started with the caret example. |
| \< | :s/\<sed/awk/g | \< means the word's beginning boundary. It is neither the first character of the word, nor the whitespace or punctuation before it, but instead an imaginary position between the two, or in the case where the word is at the start of the line, it exactly overlays ^. This example finds every word starting with the string sed and replaces that string with awk. But, for instance, it would leave the word "based" unharmed, because the string sed is not at the beginning of the word. |
| \> | :s/\<sed\>/awk/g | \> means the word's ending boundary. It is neither the last character of the word, nor the whitespace or punctuation after it, but instead an imaginary position between the two, or in the case where the word is at the end of the line, it exactly overlays $. This example finds every word sed and replaces that word with awk. But, for instance, it would leave the word "based" unharmed, and also the word "sediment", because the string sed is not at both the beginning and end of the word. |
Because many characters are used as character representers, character set
shortcuts, and other constructs, they are not available in regular expressions
unless they're escaped. Here's a partial list of characters that must be
escaped:
| \. | A literal period |
| \\ | A literal backslash |
| \/ | A literal forward slash. Note that because of the forward slash's role in regular expressions, this escape pattern must be used even in the replace section of search and replace regular expressions. |
| \^ | A literal caret. Normally a caret represents "beginning of line". |
| \$ | A literal dollar sign. Normally a dollar sign represents "end of line". |
The VI character set shortcuts discussed so far are handy, but in addition
to those, many VI implementations recognize Posix character set shortcuts,
as shown in the following table:
| [[:alnum:]] | letters and digits |
| [[:alpha:]] | letters |
| [[:blank:]] | space and tab characters |
| [[:cntrl:]] | control characters |
| [[:digit:]] | decimal digits |
| [[:graph:]] | printable characters excluding space |
| [[:lower:]] | lowercase letters (all letters when 'ignorecase' is used) |
| [[:print:]] | printable characters including space |
| [[:punct:]] | punctuation characters |
| [[:space:]] | whitespace characters |
| [[:upper:]] | uppercase letters (all letters when 'ignorecase' is used) |
| [[:xdigit:]] | hexadecimal digits |
| [[:return:]] | the <CR> character |
| [[:tab:]] | the <Tab> character |
| [[:escape:]] | the <Esc> character |
| [[:backspace:]] | the <BS> character |
Remember, a character set is a wildcard against which you can match a single
character. You cannot use it to match a string or phrase, unless you use
enumerators...
Enumerators
Enumerators apply to a single character -- the character or character set
that precedes them in the regular expression. An enumerator declares how
many of the preceding character will match.
The most general form of an enumerator is:
\{n,m}
The first number (n) is the minimum number of the preceding character needed to create a match. For instance,
A\{2,4} matches AA, AAA, or AAAA, and it finds a match inside a longer run such as AAAAAAAAA, but it does not match a single A not followed or preceded by an additional A.
So you might wonder what the second number is. The second number (m) is the
maximum number of the preceding character that a single match can take in,
which also determines where the search for the next match begins. For instance, take a string
with 13 A characters in a row, preceded and followed by a space.
/A\{2,4} the first time
 AAAAAAAAAAAAA
 ^
/A\{2,4} the second time
 AAAAAAAAAAAAA
     ^
/A\{2,4} the third time
 AAAAAAAAAAAAA
         ^
/A\{2,4} the fourth time (no match, because only 1 A remains)
 AAAAAAAAAAAAA
The first time it matches the first 4 letters. The second time it matches
the second four. The third time it matches the third four. Now it gets interesting.
It doesn't match the last A because there's only one left. If there had been
14 A's instead of 13, it would have matched the last 2. So the last number
is the maximum match, which is also how far the search skips ahead before it
begins trying for the next match.
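You can check the 13 A's example outside of VI. The sketch below uses Python's re module, where the same enumerator is written {2,4} without the backslash; apart from that syntax difference, the matching is assumed to behave the same way.

```python
import re

text = " " + "A" * 13 + " "     # 13 A's with a space on each side

# Each span is (start, end) of one match; matches never overlap.
spans = [m.span() for m in re.finditer(r"A{2,4}", text)]

# With 14 A's the leftover pair at the end is long enough to match.
spans_14 = [m.span() for m in re.finditer(r"A{2,4}", " " + "A" * 14 + " ")]
```

The first list holds three matches of four A's each, and the lone 13th A is left over. The second list gains a fourth, two character match.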
Remember also that the character being enumerated can be a character set.
For instance, the following finds strings of 2 to 100 instances of whitespace
(presumably to convert them to nothing, thereby squeezing out repeated spaces)
\s\{2,100}
That brings up a good point. You really wanted 2 to infinity, not 2 to 100.
Another way of saying 2 to infinity is 2 or more, and the enumerator for
that is \{2,}. That matches 2 or more consecutive instances of the preceding
character or character set (whitespace in this case),
and matches all of them, so that subsequent matches won't match any more
of that consecutive run.
Perhaps you want to match exactly two of a character. The enumerator for that is \{2}.
There are shortcuts for the most common enumerators, namely, 0 or more, 1 or more, and 0 or 1:
| Shortcut | Equivalent enumerator | Meaning |
| * | \{0,} | 0 or more |
| \+ | \{1,} | 1 or more |
| \? | \{0,1} | 0 or 1 |
The following matches a traditional 10 digit phone number:
/(\d\{3})[-[:space:]]\?\d\{3}-\d\{4}
The preceding is read as follows:
An opening paren, followed by exactly three digits, followed by a closing paren, followed by zero or one
instance of either a whitespace character or a dash, followed by exactly
three digits, followed by a dash, followed by exactly four digits.
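Here is the same phone number pattern as a Python sketch, so you can try it against sample strings. Python's dialect is assumed here: the enumerators lose their backslashes, and literal parens gain them, but the reading is piece for piece the same. The sample strings are invented.

```python
import re

# ( + 3 digits + ) + optional dash or whitespace + 3 digits + dash + 4 digits
phone = re.compile(r"\(\d{3}\)[-\s]?\d{3}-\d{4}")

found = phone.search("call (800) 555-1212 now")
matched_text = found.group(0) if found else None

with_dash = phone.search("(800)-555-1212") is not None
no_separator = phone.search("(800)555-1212") is not None
no_parens = phone.search("800-555-1212") is not None   # the pattern demands parens
```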
Armed with character sets and enumerators, you can search and successfully
find simple to moderately complex patterns. But some things, like locating
phone numbers whether or not they have area codes, and parsing of delimited
lines, require more. Read on...
Using Matched Text as Replacement Text
Something that comes up with surprising regularity is the need to switch
2 parts of every line. For instance, consider a comma delimited data file.
What if you need to switch the first and second fields? Here's how you do
it:
:%s/\([^,]*\),\([^,]*\),\(.*\)/\2,\1,\3/
Your first reaction might be that the preceding is ugly. That's true, but
compare it to writing a program or creating a macro like you need to do in
so many other editors.
Here's a character by character explanation of the preceding command:
| :%s | Substitute on every line |
| / | Begin search expression |
| \( | Begin a group. The group is ended by \). Everything inside a search expression group is remembered such that it can be used within the replace text. |
| [^,] | A character set that matches everything EXCEPT a comma. In a comma delimited file, this is how you implement short matching so the match ends on the first comma. |
| * | An enumerator indicating zero or more of the character or character set that preceded it. In this case, it means zero or more non-commas. Those zero or more non-commas correspond to the first field in the line. |
| \) | Ends the first group. |
| , | Matches a comma, which is the field delimiter in a comma delimited file. |
| \( | Begins the second group |
| [^,]* | Zero or more non-commas corresponding to the second field |
| \) | Ends the second group |
| , | Matches the comma that separates the second field from the third field |
| \( | Begins the third group |
| .* | Zero or more characters of any type. This represents the rest of the line. Because the only fields you're modifying are the first and second, you can group the third, fourth, ... fields together into one group -- the third group. |
| \) | Ends the third group |
| / | Ends the search expression and begins the replace expression |
| \2 | The text that was matched by the second group of the search expression, which, in fact, was the second field, which we are now putting first |
| , | Literal comma to separate the first and second fields in the replaced text. |
| \1 | Text matched by the first group in the search expression, which we are now putting second. |
| , | Literal comma to separate the new second field from the third |
| \3 | Text matched by the third group in the search expression, which in fact represented all fields after the first two |
| / | Ends the replace expression |
Study the preceding information. Once you understand the preceding, you understand
most of what it takes to use search information in the replace expression.
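If you'd like to experiment with the field swap outside of VI, the following Python sketch does the same thing to one line at a time. The function name swap_first_two and the sample records are made up; note that Python writes groups as plain parens, where VI writes \( and \).

```python
import re

def swap_first_two(line):
    """Exchange the first two comma delimited fields of one record."""
    # Three groups: first field, second field, everything else.
    return re.sub(r"([^,]*),([^,]*),(.*)", r"\2,\1,\3", line, count=1)

swapped = swap_first_two("Litt,Steve,Orlando,FL")
too_short = swap_first_two("Litt,Steve")   # only one comma, so no match
```

A record with fewer than three fields has no second comma, so the pattern doesn't match and the line comes back untouched.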
Here's another example -- a titlecase converter:
:%s/\<\(.\)\([^[:space:][:punct:]]*\)\>/\u\1\2/g
You might use this in a VI based book outline in order to titlecase all headings
for later use as parts, sections, subsections, etc. Let's quickly analyze:
| :%s | Substitute on every line |
| / | Start search expression |
| \< | Matches position at beginning of word (between the space or punctuation preceding the word and the first alphanumeric of the word) |
| \(.\) | Group containing the first character of the word |
| \( | Begin second group |
| [^[:space:][:punct:]]* | A series of (zero or more) characters that are neither whitespace nor punctuation. This makes up the remainder of the word. This is discussed further in the Short Matching section of this article. |
| \) | End second group |
| \> | Matches position at end of word (between the last alphanumeric of the word and the punctuation or space following the word) |
| / | Ends the search expression and begins the replace expression |
| \u | Means the next character or metacharacter will be capitalized |
| \1 | The text that matched the first group of the search expression. In this case that means the first character of the word, which due to the \u before it in the replace expression, means it's capitalized. |
| \2 | The text that matched the second group of the search expression. In this case that means the remainder of the word. |
| / | Ends the replace expression |
| g | Means global, meaning that you don't quit on the first match. This is what enables this title case converter to work on every word of the line. |
Once again, review the preceding, especially noting the use of \< and \> to delineate words, the use of
[^[:space:][:punct:]]* to represent the remainder of a word, and the use of \u to uppercase the next item in the replace expression.
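A rough Python equivalent of the titlecase converter follows. Python's replacement text has no \u, so this sketch uses a small callback to uppercase the first group instead, and it uses \b and \w in place of \<, \> and the bracketed word characters. Those substitutions are assumptions of the sketch, not something VI does.

```python
import re

def title_case(line):
    """Capitalize the first letter of every word on the line."""
    # Group 1 is the first character of the word, group 2 is the rest.
    return re.sub(r"\b(\w)(\w*)",
                  lambda m: m.group(1).upper() + m.group(2),
                  line)

heading = title_case("the anatomy of a global command")
```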
The following table lists several pieces of syntax used when you use matched text in the replacement:
| \(.....\) | Delineates a group. Whatever matches the expression inside the \( and \) is remembered and available in the replace expression. In this example, any series of 5 characters matches the expression inside the \( and \). |
| \1 | Used inside the replace expression, this is replaced by the text matched by the first group in the search expression. |
| \2 | Used inside the replace expression, this is replaced by the text matched by the second group in the search expression. Likewise \3 is replaced by the text matched by the third group, \4 by the text matched by the 4th group, all the way up to \9. |
| \u | Used in the search expression, this matches any upper case letter. Used in the replace expression, this uppercases the next single item in the replacement expression. |
| \l | Used in the search expression, this matches any lower case letter. Used in the replace expression, this lowercases the next single item in the replacement expression. |
| \U | Used in the search expression, this matches any character that IS NOT an upper case letter. Used in the replace expression, this uppercases the remainder of the replacement expression. However, a later \L in the replace expression will "turn off" the \U at the point of appearance of the \L. |
| \L | Used in the search expression, this matches any character that IS NOT a lower case letter. Used in the replace expression, this lowercases the remainder of the replacement expression. However, a later \U in the replace expression will "turn off" the \L at the point of appearance of the \U. |
Using matched text is one of VI's most powerful features. Many other word
processors require you to write a macro in order to accomplish similar things.
Of course you could write the same types of macros in VI, but once you get
used to regular expressions it takes only 1/4 to 1/10 the time.
Short Matching
Consider the following sentence:
I will roll the wheel from the whalebelly to the well.
Let's say you want to find every string that begins with w and ends with
ll, replacing it with XXX. You want the line changed to the following:
I XXX roll the wheel from the XXXy to the XXX.
How about this:
:s/w.*ll/XXX/g
The g on the end means global, meaning keep finding and replacing
after finding the first one. Unfortunately, the preceding regex will instead
produce the following:
I XXX.
What happened? Unless told otherwise, regular expressions match as
much as possible. So it found the first w (the one in will), and the last
ll (the one in well), and replaced them, and everything in between, with
XXX. How would you make the matches as short as possible? Some environments,
such as Perl, have special modifiers enabling enumerators to match as short
as possible. This isn't available in many VI implementations.
What's needed is a way to limit the scope of the enumerator. Consider the following:
:s/w[^l]*ll/XXX/g
This is much closer, producing the following:
I XXX roll the wheel from the whalebelly to the XXX.
It found "will" and "well", but missed "whalebell". What happened is the
first "l" in "whalebelly" reset the search, so that the search went back
to searching for "w", moving all the way to the last word of the line.
If we can agree that we should match long on a per word basis (in other words,
"wellwall" will be replaced by XXX and not XXXXXX), we can limit by word
boundaries instead of the "l" character:
:s/w[^[:space:][:punct:]]*ll/XXX/g
The preceding produces the desired:
I XXX roll the wheel from the XXXy to the XXX.
If you want full short matching ("wellwall" replaced by XXXXXX), you need to use advanced techniques involving groups.
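The three attempts can be replayed in Python as a sanity check. This sketch assumes Python's re module matches long by default just as VI does, and it builds a stand-in for [^[:space:][:punct:]] from string.punctuation, because Python does not understand the Posix bracket names.

```python
import re
import string

sentence = "I will roll the wheel from the whalebelly to the well."

# Long matching: first w to last ll, and everything in between.
greedy = re.sub(r"w.*ll", "XXX", sentence)

# Limit the enumerator with "anything but l".
no_l = re.sub(r"w[^l]*ll", "XXX", sentence)

# Limit the enumerator to characters that can appear inside a word.
word_char = r"[^\s" + re.escape(string.punctuation) + r"]"
per_word = re.sub(r"w" + word_char + r"*ll", "XXX", sentence)

# Matching is still long within a word.
long_in_word = re.sub(r"w" + word_char + r"*ll", "XXX", "wellwall")
```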
Advanced Searches with Groups
By Steve Litt
I wouldn't blame you for skipping this article. It's brutal, and for 90%
of your editing activities you won't need its power. Indeed, what you can
do with groups in a regular expression would be difficult for a good programmer
using C (unless he was using regular expressions).
Let's say you want to find a phone number. Look at all the ways phone numbers can be written:
- (800)555-1212
- (800) 555-1212
- (800)-555-1212
- 800-555-1212
- 555-1212
These don't include putting 1- in front -- it can be done but I want to keep this
example understandable. How can you write a single search
expression to find all of these different methods?
The first thing to notice is that everything except the area code is simple:
/\d\{3}-\d\{4}
As things get hairy, it's important to note that the last 8 characters are a "gimme".
If you notice, either you have an area code plus optional dash or space between
it and the exchange, or you don't. So you make a group to hold the area code,
its parens or lack thereof, and optional dash or space. We already discussed
groups in the Using Matched Text as Replacement Text
section of this article, where we used it to capture text for later use in
the replace expression. The more powerful use is to create incredibly versatile
searches, such as detecting all these different phone number formats.
So anyway, we can start by dreaming of a group containing area code, optional
parens, and an optional trailing separator, and prepending it onto the search
for the exchange and 4 digit number:
/\(area_code_group\)\?\d\{3}-\d\{4}
Following in the footsteps of programmers since the dawn of time, let's simply
forget about the last 7 digits and concentrate on the area code group.
One thing you know is that the area code either has parentheses or it doesn't.
The other punctuation depends on whether it has parentheses. If it has parentheses
you look for this:
/(\d\{3})[- ]\?
That's an opening paren, exactly three digits, a closing paren, and one or
zero instances of either a space or a dash. Obviously if there are zero
instances it matches a string where the paren butts up against the telephone
exchange.
If there are no parens, it's even simpler because you must have a dash after the area code.
\d\{3}-
We'll use the \| operator to represent alternatives. In truth I don't know
whether many VI implementations offer the \| operator, but I know that Vim
does. By the way, in case you haven't noticed, Vim is my favorite VI implementation,
though I like all VI's. Anyway, here's the first step in representing the
area code:
/(\d\{3})[- ]\?\|\d\{3}-
This matches area codes with or without parens. Remembering phone numbers
can have one or zero area codes, together with the area code's parens and
optional trailing dash or space, we put that in a group that can happen zero
or one time, and append the regex for the last 7 digits of the phone number:
/\((\d\{3})[- ]\?\|\d\{3}-\)\?\d\{3}-\d\{4}
To make this clearer, let's take it apart piece by piece:
| \( to \) | Group consisting of all area code and trailing punctuation alternatives |
| \? after \) | Signifies 0 or 1 instances of the area code group. |
| \d\{3}-\d\{4} | Phone number without area code |
| \| | Signifies that the elements on its left and on its right are either/or alternatives |
| (\d\{3})[- ]\? | Structure (with options) of area code with parens |
| \d\{3}- | Structure of the area code without parens |
Groups can be nested, and nested, and nested some more. In VI implementations
that allow it, alternative groups can be specified with the \| operator (or
possibly a different operator in a different VI implementation).
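To convince yourself the grouped expression really covers all five formats, you can try it in Python. This is a sketch in Python's dialect, where the group is written ( ... ), the alternative operator is a bare |, and literal parens are escaped; the list of formats is the one from this article.

```python
import re

# Optional area code group (two alternatives), then exchange and number.
phone = re.compile(r"(\(\d{3}\)[- ]?|\d{3}-)?\d{3}-\d{4}")

formats = [
    "(800)555-1212",
    "(800) 555-1212",
    "(800)-555-1212",
    "800-555-1212",
    "555-1212",
]
all_match = all(phone.fullmatch(number) for number in formats)

# Without parens the area code must be followed by a dash, not a space.
space_no_parens = phone.fullmatch("800 555-1212") is not None
```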
If your head doesn't hurt too badly, for extra credit try to make a substitute
expression that converts all these phone number formats to a single format.
I can't do it in a single step, but I can do it in a 2 step process. The
first substitute finds all valid phone numbers and surrounds them with pho_ on one side and _ohp
on the other. The second finds such surrounded strings and picks off the
digits. This 2 stage approach frees the second replace from worrying about
punctuation, or mixing detection and validation with substitution.
To do this, step 1 must surround the ENTIRE search expression with \( and
\) to make it a single group, and place it in a substitute:
:%s/\(\((\d\{3})[- ]\?\|\d\{3}-\)\?\d\{3}-\d\{4}\)/pho_\1_ohp/
Step 2 formats all resulting strings that include area codes in a standard format:
:%s/pho_\D*\(\d\d\d\)\D*\(\d\d\d-\d\d\d\d\)_ohp/(\1)-\2/
You'll notice that all phone numbers that didn't include an area code were
not converted. To convert them you need to run a second, much simpler substitute
command.
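The whole process can be sketched in Python as follows. The first two substitutions mirror the two VI commands above; the third one, which strips the markers from numbers without an area code, is my guess at the "much simpler" cleanup and is not taken from the commands above. The function name standardize is made up, and unlike the VI commands (which lack the g flag), re.sub handles every phone number on the line.

```python
import re

MARK = r"((\(\d{3}\)[- ]?|\d{3}-)?\d{3}-\d{4})"

def standardize(text):
    # Step 1: surround every recognized phone number with pho_ and _ohp.
    marked = re.sub(MARK, r"pho_\1_ohp", text)
    # Step 2: pick the digits out of marked numbers that have an area code.
    with_area = re.sub(r"pho_\D*(\d\d\d)\D*(\d\d\d-\d\d\d\d)_ohp",
                       r"(\1)-\2", marked)
    # Step 3 (assumed cleanup): unmark numbers that had no area code.
    return re.sub(r"pho_(\d\d\d-\d\d\d\d)_ohp", r"\1", with_area)

results = [standardize(n) for n in
           ["(800)555-1212", "(800) 555-1212", "800-555-1212", "555-1212"]]
```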
I think you know where I'm going with this. If we can recognize all forms
of phone numbers, and we can also remember matched text for use in the replace
expression, then it's pretty obvious we can make a regular expression to
"standardize" phone numbers. In this case we need to surround the digits
of the phone number with slash parens. It's made more difficult because those
slash parens are nested in a major group and are alternatives.
Summary
Regular expressions are huge. Master them and you can do
in 5 minutes what others take hours to do. One of the most powerful regular
expression techniques is the use of groups. By using groups, you can search
for various alternatives, and for phrases that may or may not be there or
may be repeated. Groups can be nested.
Registers, Cut and Paste
By Steve Litt
The most significant factor in a software developer's
success is his or her design process. Next comes his or her debugging process,
including use of tools such as debuggers. Obviously, a good memory or skillful
use of reference material play an important part also.
And for those of us without total recall, cut and paste plays a vital role
in our software development productivity. The ability to copy documentation
and paste it into our code, to copy a function declaration to the code that
uses the function, and many other cut and paste tasks speed our productivity
immensely.
VI's cut and paste is richer than most other editors because of VI's vast
array of registers. A register is simply a place to store text.
The simplest cut and paste operation is to delete a single line and paste it elsewhere. The line delete is accomplished by the dd key sequence, which deletes the current line and places it in the default register, the "" register. You can then move to another location, and press the p key, which pastes the contents of the "" register below the current line.
If you had wanted to copy the line instead of cut it, you could have used the yy keystroke sequence to "yank" the line into the "" register, and then moved and used the p keystroke to paste it below the current line.
By the way, if you want to paste above the current line instead of below it, you'd use the P keystroke instead of the p keystroke.
Anyway, the "" register always contains the text of your last deletion
or yank, meaning it will quickly be overwritten by the next delete or
yank. To save a delete or yank, delete it or yank it into a specific register.
For instance, to delete the current line and save it in the "b register, use the "bdd keystroke sequence. To yank (copy without delete) a line into register "c use the "cyy
keystroke sequence. In general, you save any deletion or yank to a named
register by preceding the delete or yank command with a doublequote and the
register letter.
To paste from a specific register, prepend a doublequote and the register letter to the paste command. Thus "bp pastes the contents of the "b register after the current position, while "eP pastes the contents of the "e register before the current position.
To see a list of all the registers and their contents, use the :reg command.
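If it helps to picture registers as a data structure, here is a toy Python model. It is only a sketch: the class and method names are invented, it ignores Vim's numbered and special registers, and it assumes (as Vim does) that a delete or yank into a named register also updates the "" register.

```python
class Registers:
    """Toy model of VI registers: register name -> stored text."""

    def __init__(self):
        self.store = {}

    def save(self, text, name=None):
        # Every delete or yank lands in the unnamed "" register ...
        self.store['"'] = text
        # ... and also in a named register when one was requested.
        if name is not None:
            self.store[name] = text

    def paste(self, name='"'):
        return self.store[name]

regs = Registers()
regs.save("first line", name="b")    # like "bdd
regs.save("second line")             # like dd
regs.save("third line", name="c")    # like "cyy

from_b = regs.paste("b")             # like "bp
from_unnamed = regs.paste()          # like p
```

The text saved in "b survives the later deletes and yanks, while the "" register holds only the most recent one.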
The System Clipboard Register
Some VI implementations have a special register whose contents are those
of the system clipboard. Vim is one such implementation, and in Vim that
register is the "* register. So in Linux, if you want to paste part
of a browser hosted help file into the file you're editing, highlight the
material in the browser, switch to Vim and type the "*p keystroke sequence.
Likewise, to move a line from Vim to Mozilla Composer, press "*yy,
switch to Mozilla Composer, press the middle mouse button, and watch the
line appear in Mozilla Composer. A single line isn't particularly impressive,
but with techniques discussed later in this article you can copy entire subroutines.
Various Delete and Yank Commands
Sometimes you want to delete only part of a line, instead of the complete
line. To delete to the end of the current word, you'd use the dw keystroke sequence. To copy the remainder of the current word you'd use the yw keystroke sequence. When you delete or yank a line fragment, the p and P commands act differently, in that they paste before or after the current character instead of the current line.
In general, deletion commands have corresponding yank commands. The following table lists several such commands:
| Text to copy or cut | Delete | Yank | Char or Line |
| Current line | dd | yy | L |
| Current plus next 2 lines | 3dd | 3yy | L |
| Current word | dw | yw | C |
| Current plus next 2 words | 3dw or d3w | 3yw or y3w | C |
| Back over the previous 2 words | 2db | 2yb | C |
| Current char to end of line | D or d$ | y$ | C |
| Preceding char to beginning of line | d0 | y0 | C |
| Character under cursor | x | yl | C |
| Character before cursor | X | yh | C |
| Remainder of paragraph | d} | y} | C |
| Earlier part of paragraph | d{ | y{ | C |
| Current char to start of searchtext | d/searchtext | y/searchtext | C |
| Preceding char to start of searchtext | d?searchtext | y?searchtext | C |
| To end of sentence (not counting the period) | d/\. | y/\. | C |
Notice in the preceding that many of the commands are composed by appending a motion command to the end of the d or y. The motion command can be a built in one like w or }, or it can be a search. Either way, all the characters traversed in the motion are deleted or yanked.
WYSIWYG Cut and Paste
Many VI implementations, including Vim, offer visual mode highlighting: highlight the text, then press the d or y
key to delete or yank it. For instance, in Vim you can
highlight line by line by pressing the V key and then moving with motion commands
such as j, k, } or a search. You can start highlighting character by character with the v key.
And in Vim, you can start rectangular highlighting with the Ctrl+v keystroke
combination. This extremely powerful capability allows you to switch fields
around in a fixed length sequential record file. You can also build up such
a file field by field: type a field's contents on each line, then
rectangular cut that column and paste it onto the end of the preceding fields.
Rectangular delete is excellent for deleting the first several characters
from several lines or all lines in the file, or for deleting past a certain
column.
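As an illustration of rectangular delete, here is a small sketch, assuming Vim and a buffer with at least 10 lines of at least 4 characters each. It removes the first 4 columns from the first 10 lines.

```vim
" gg0      go to the top left corner of the buffer
" \<C-v>   start rectangular highlighting (Ctrl+v)
" 9j       extend the rectangle down 9 more lines
" 3l       extend the rectangle right 3 more columns
" d        delete the highlighted rectangle
execute "normal! gg0\<C-v>9j3ld"
```

Typing the same keys by hand (gg0, Ctrl+v, 9j, 3l, d) does the same thing interactively, and p afterwards pastes the rectangle back as a block.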
Summary
VI offers a wide variety of delete, cut, copy and paste techniques
that are keyboard easy. The practitioner knowing the techniques listed in
this article will be able to edit quickly. Your VI implementation will probably
have many additional techniques not discussed in this article, giving you
the opportunity to be even faster.
In VI the concepts of "cut" and "delete" are identical -- deleting text always sends it to the ""
register, which could be thought of as "the VI clipboard". This concept is
called "delete" in VI, and is implemented with various keystroke sequences
starting with d. The concept of "copy" is called "yank" in VI, and is implemented with various keystroke sequences starting with y.
Deletes and yanks can either be character by character or line by line. The distinction is important when pasting.
VI has two paste commands, p and P. The lower case paste
command pastes after the current position, while the upper case version pastes
before the current position. If the previously deleted or yanked material
was line by line, the pasted material goes in new lines opened either after
(p) or before (P) the current line. If the previously deleted
or yanked material was character by character, the new material starts on
the current line either after (p) or before (P) the character over which the cursor resides.
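Two classic idioms show the difference between line by line and character by character pasting. This is a small sketch, assuming Vim (for the :normal command). Typed by hand, the keystrokes work in any VI.

```vim
" Line by line: yank the current line, then paste the copy on a new
" line below it (duplicates the line)
normal! yyp

" Character by character: delete the character under the cursor, then
" paste it after the character that slid into its place (swaps two
" adjacent characters)
normal! xp
```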
Maps and Abbreviations
By Steve Litt
You can map substantial commands to simple keystrokes using the :map command. For instance:
map ,,1 :set foldlevel=1<CR>
Note: if your VI implementation does not accept <CR> as a carriage
return equivalent, just make the real thing with Ctrl+V followed by Ctrl+M.
After the preceding command, any time you press the keystroke sequence ,,1
within command mode, your foldlevel gets set to 1. But what if you
want it to work within insert mode also? In that case you might do something
like this:
:map! ,,1 <ESC>:set foldlevel=1<CR>a
Once again, if your implementation doesn't recognize the text "<ESC>"
use Ctrl+V followed by the Esc key to place a literal Esc character.
The exclamation point after the word "map" indicates this mapping should
apply in insert mode (in Vim, also on the colon command line), not in command mode. So the command first does the <Esc>
command to get out of insert mode, then sets the foldlevel, and then runs
the append command (a) to get back into insert mode. To the user it seems
like ,,1 set the foldlevel from within insert mode.
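Vim also offers mode-specific mapping commands, which some people find easier to read than :map and :map!. The following is a sketch of the same two mappings in that style. It assumes Vim, since nnoremap and inoremap are not part of the original VI.

```vim
" Command (Normal) mode only; the "nore" form ignores other mappings
nnoremap ,,1 :set foldlevel=1<CR>

" Insert mode only; Ctrl+O runs one command and then returns to
" insert mode, so no <Esc> ... a pair is needed
inoremap ,,1 <C-o>:set foldlevel=1<CR>
```

Lines like these would typically go in a startup file such as .vimrc so the mappings exist in every session.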
An abbreviation is very similar to a map! command. It works from within insert
mode only. Maps render instantly, but abbreviations do not render until another
character is pressed. So for instance, after the following command, sl expands to Steve Litt as soon as the l is pressed:
map! sl Steve Litt
On the other hand, after the following command, sl expands to Steve Litt only after the sl is
followed by a space or punctuation:
abbr sl Steve Litt
If it's followed by, let's say, ant,
the word "slant" is the result. Abbreviations are usually used to abbreviate
common phrases, while map! commands are typically used to run commands.
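For housekeeping, abbreviations can be listed and removed as well as created. This is a short sketch, assuming Vim, where iabbrev and iunabbrev make the insert mode restriction explicit.

```vim
" Define the abbreviation for insert mode
iabbrev sl Steve Litt

" With no arguments, list the abbreviations currently defined
abbreviate

" Remove the abbreviation again
iunabbrev sl
```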
Maps and abbreviations can greatly speed productivity. Maps are often used to provide a quick entry to macros and scripts...
Macros and Scripts
By Steve Litt
A macro is a sequence of keystrokes to accomplish
a task. A script is a sequence of commands to accomplish a task. Scripts
are usually EX mode commands (the kind of commands that start with a colon
(:)), while macros are usually command mode commands such as dd and i, plus typed in text. Macros are stored in registers, while scripts are stored in files. Macros are run by typing @ and then the register letter, such as @b. Scripts are run by typing :source followed by the filename of the file containing the script:
:source myscriptfile
Macros are recorded by pressing the q key followed by a letter representing the register in which to store the macro. The recording process is ended when a second q is pressed while in command mode.
So, just like so many other editors, VI enables you to construct a macro
interactively in real time, and then later play the macro back. Macros become
even more powerful when you prepend a count, the number of times
to run the macro. For instance, if you wanted to append the word "Done" to 10 lines,
you could create the following macro in register b:
A Done^[j0
The preceding appends the word Done to the end of the line, types an Esc
character to get back into command mode, goes to the next line and
then goes to the beginning of the line. So you've appended "Done" and then
gone to the beginning of the next line. If you then type the 10@b keystroke sequence, the current line and the next 9 have the word "Done" appended to them. Better yet, if you type 1000@b
and there are only 20 lines in the file, the current line and every succeeding
line is appended, but at end of file the j command fails, which stops the macro, so the other 980 iterations are ignored.
Macros are the way editors without regular expressions do much of their magic,
and in fact, VI can do the same type of magic with macros.
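A macro doesn't have to be recorded with q. Because it lives in a register, it can also be loaded by assignment. The following is a minimal sketch of the "Done" macro set up that way, assuming Vim (the :let and :normal commands are Vim extensions).

```vim
" Store the macro in register b directly instead of recording it.
" Inside double quotes, "\<Esc>" becomes a real Esc character.
let @b = "A Done\<Esc>j0"

" Play it back 10 times, starting on the current line
normal! 10@b
```

This is handy when a macro is worth keeping: the let line can live in a script file and be reloaded with :source.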
Let's review some distinctions between macros and scripts:
Property | Macro | Script
What it is | A series of command mode character commands | A series of EX mode colon commands
Where it's stored | In a register | In a file
How it's recorded | Interactively following the q command | Typed into a file with a text editor
How recording is stopped | A q command within command mode | n/a
How it's run | @ followed by the register letter | :source followed by the filename
How it's run multiple times | Precede the @ with a multiplier | n/a
How to append to it | Typing q followed by the upper case register letter appends to the existing register macro. | In a text editor.
How to edit it | Assuming register b contains the macro: open a blank line in VI, type "bP to insert the register b macro on the line, edit the line as desired, move to the beginning of the line, then type "bD to delete the line to register b. | In a text editor.
Running one from the other | Include :source filename<CR> in the macro while in command mode | :normal @b (see note)
NOTE: The :normal @b syntax works only in Vim, and possibly a few other
VI implementations. It doesn't work in the "standard" VI from the UNIX world.
Scripts
At the dawn of computerized civilization, before the invention of the VI
full screen editor, there was a line editor called EX. EX was incredibly
powerful, but being a line editor with a command line, it was spectacularly
unproductive. Then somebody built VI as a full-screen front end to EX. The
authors of VI left several ways to access the underlying EX. Some of the
more common ones are:
1. You can run a single EX command by preceding it with the colon character.
2. You can put a series of EX commands in a script and run it with the :source command.
3. You can go into EX mode by running VI with the -e option.
#3 is rarely used today and won't be discussed. #1 has been mentioned frequently
so far in this magazine. #2 is the subject of this section.
The following is a very abbreviated list of EX commands you can use in scripts:
COMMAND | EFFECT
:q | Quit VI
:q! | Quit even if there are unsaved changes
:e filename | Edit a different file
:e! filename | Edit a different file even if this one has unsaved changes
:e! | Edit the disk version of this file, even if there are unsaved changes in the memory version. This is basically a revert command.
:bn | Go to the next buffer. For each file opened with an :e command there is a buffer, so you can switch from buffer to buffer, as long as the current one is saved. This command wraps, meaning once you pass the last buffer, you go back to the first one.
:bp | Go to the previous buffer. This command also wraps, meaning once you pass the first buffer, you go back to the last (highest numbered) one.
:b3 | Go to buffer #3. You can put any number in there to go to its corresponding buffer. If you put a number that has no buffer, you get an error message.
:bl | Go to the last (highest numbered) buffer in the buffer list.
:bd | Delete the current buffer, so the file is no longer in memory (but it's still on disk). In Vim the remaining buffers keep their numbers, so deleting buffer #5 leaves a "hole" between #4 and #6 in the list of buffers.
:buffers | Lists all buffers by number, with filename
:/search | Search forward and go to the first matching line
:?search | Search backward and go to the first matching line
:s/search/repl/ | The substitute command. Can be used with ranges, as discussed in the Ranges section of the Search and Replace: Substitute and Global Commands article.
:g/search/cmd | The global command. Can be used with ranges, as discussed in the Ranges section of the Search and Replace: Substitute and Global Commands article.
:set | Sets a VI property
:let x = 1 | Variable assignment
:d | Delete line. Can be used with ranges, as discussed in the Ranges section of the Search and Replace: Substitute and Global Commands article.
:put | Paste the content from the last delete or yank. Can be used with an optional line number and an optional register. See :h :put for details.
:!sysCmd | Runs an operating system command. When preceded by a range (see the Ranges section of the Search and Replace: Substitute and Global Commands article), it filters lines through the command: the lines in the range are sent to the command, and are replaced by the output of the command. Common examples:
    COMMAND | RESULT
    :%!sort | Sort the file
    :%!sort -n | Sort file numerically
    :%!sort -u | Sort unique -- delete dups
    :%!cat -n | Prepend line numbers to each line
:r filename | Inserts the contents of filename below the current line.
:r!sysCmd | Inserts the output of sysCmd below the current line. A few common examples:
    sysCmd | RESULT
    /sbin/ifconfig | Inserts network information
    ls -l | Inserts a directory listing
    galeon --help | Inserts Galeon's usage information
:w | Saves the current file
:w filename | Saves the current file as filename. WARNING!!! The buffer you are editing is still the old filename, not the new one. To edit the new filename, perform the :e filename command immediately after :w filename.
:4,8w filename | Saves lines 4 through 8 of the current file to filename.
:4,8w>>filename | Appends lines 4 through 8 to filename.
:normal ddp | The example to the left switches the current line with the one below it. :normal runs a command mode Vim command from the EX prompt. This is not in VI, but it is in Vim and possibly other implementations.
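Several of the commands above can be combined into a small script. What follows is a sketch rather than a recommended tool: the filename cleanup.vim is made up for illustration, the \s and \+ pattern items assume Vim's regular expressions, and the filter line assumes a UNIX sort program on the path. Inside a script file the leading colons are optional.

```vim
" cleanup.vim (hypothetical name); run it with  :source cleanup.vim

" Remove trailing whitespace from every line.
" The e flag suppresses the error when no line matches.
%s/\s\+$//e

" Delete every empty line
g/^$/d

" Filter the whole buffer through the system sort, dropping duplicates
%!sort -u

" Save the result
w
```

Because the script changes and saves the file, it's safest to try it on a copy first.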
Convenient Set Commands
SET_COMMAND | EFFECT
:set textwidth=0 | Turns off wordwrap during typing, but the gq command still changes line ends.
:set textwidth=99999 | Turns off wordwrap during typing, and forces the gq command to concatenate all adjacent nonblank lines. If you previously composed an email with textwidth 65, and want to paste it into a WYSIWYG mail client like kmail, you'd use :set textwidth=99999 and then gq commands to crunch all adjacent nonblank lines into a single line so that kmail can wrap the line.
:set ai, :set noai | Set or unset autoindent. Autoindent causes new lines to start at the same column position as the line above them.
:set ic, :set noic | Set or unset ignorecase. When ignorecase is set, searches, substitutes and global commands match in a case insensitive manner. When it's unset, searches, substitutes and global commands are case sensitive in matching.
:set fileformat=unix, :set fileformat=dos | Puts the proper line endings on lines, according to operating system. DOS lines end with crlf, while UNIX lines end with lf.
:syntax on, :syntax off | Turns syntax specific editing (coloration, etc) on or off.
:set syntax=perl | Sets the syntax language. Perl, C, Java, HTML, and many more languages are supported.
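The two textwidth settings can be exercised together. The following is a brief sketch of the email workflow described in the table, assuming Vim (for gq and :normal) and a buffer that holds only the message text.

```vim
" Rewrap the whole buffer at 65 columns:
" gg goes to the first line, gqG formats from there to the last line
set textwidth=65
normal! gggqG

" Later, crunch each block of adjacent nonblank lines into one long
" line, ready to paste into a mail client that does its own wrapping
set textwidth=99999
normal! gggqG
```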
Functions
In Vim, and probably some other VI implementations, your scripts can have
functions. This provides power to accomplish almost any task, including turning
VI into a different product (like an outliner). In Vim scripts, functions look something
like this:
function MakeChars(count,char)
  let i = 0
  let chars=""
  while i < a:count
    let chars = chars . a:char
    let i = i + 1
  endwhile
  return chars
endfunction
In the preceding, you see functions are delineated by their declaration, and by the word endfunction. Arguments are defined in the declaration, and the return statement is what you'd expect. Assignment is via the let statement, and you have loops and branching.
One thing not obvious is functions must not be defined multiple times, so
function definitions and declarations must be surrounded by code to prevent
multiple definition. It looks something like this:
if !exists("loaded_vimoutliner_functions")
  let loaded_vimoutliner_functions=1
  function MakeSpaces(count)
    let spaces = MakeChars(a:count," ")
    return spaces
  endfunction
  function MakeDashes(count)
    let dashes = MakeChars(a:count,"-")
    return dashes
  endfunction
endif
For more about functions, use the :h functions command to see info about built in functions, and the :h func command to see function definition syntax.
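To see such a function in use, here is a self-contained sketch, assuming Vim. It repeats the MakeChars definition so the file can be sourced on its own, and it uses function! (with the exclamation point), which in Vim replaces an existing definition instead of complaining. That is an alternative to the exists() guard, not what the examples above do.

```vim
" function! lets this file be sourced more than once
function! MakeChars(count, char)
  let chars = ""
  let i = 0
  while i < a:count
    let chars = chars . a:char
    let i = i + 1
  endwhile
  return chars
endfunction

" Insert a rule of 40 dashes on a new line below the current line
put =MakeChars(40, '-')

" Display a result on the message line
echo MakeChars(3, 'ab')
```

The :put = form pastes the value of an expression, so a function that returns a string can generate text straight into the buffer.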
Programmers Helpers
By Steve Litt
Like most other editors, VI has features to help editing source code. This
article lists the ones common to most VI implementations. The Vim article later in this magazine discusses some Vim extensions making source code editing even easier.
These programmers helpers are common to most VI implementations:
- Autoindent
- Bracket matching
- Tags
Autoindent
You turn on autoindent with the :set ai command. You turn it off with the :set noai
command. With autoindent on, every time you press Enter in insert mode, the
cursor comes to rest directly below the first nonblank character of the previous line, making programming
indentation much easier.
Various VI implementations have other indentation settings designed to "help"
the programmer, and in my opinion most of them are best turned off. There
are settings which convert spaces to equivalent tabs and tabs to equivalent
spaces, which to me is incredibly confusing. Whether you indent using tabs
or spaces, you want it consistent. I therefore suggest the following:
:set ai
:set noexpandtab
:set nosmarttab
:set softtabstop=0
:set nosmartindent
:set nocindent
The preceding uses autoindent, but none of the other "helpful" options that
can end up giving your lines arbitrary combinations of tabs and spaces. Also,
it tells VI not to impose its own indentation rules, but rather to follow
your rules and assist you by assuming a new line starts at the same position
as the one above it.
When using autoindent you'll find some indenting commands handy:
COMMAND | EFFECT
>> | Indent in command mode
<< | Exdent in command mode
Tab | Indent in insert mode (when at start of line)
Ctrl+T | Indent in insert mode (from any point on the line)
Ctrl+D | Exdent in insert mode (from any point on the line)
Bracket Matching
In an ideal world we'd all indent and document so well that matching beginning
and ending brackets, braces and parens would be intuitively obvious to the
most casual observer. In that ideal world, all maintenance programmers would
retain the obvious matching, and even during development indentation would
be perfect and brackets, braces and parens would never accidentally be omitted.
For those of us not living in the ideal world, there's bracket matching.
In command mode, simply place the cursor on a square bracket, curly brace
or paren, and press the percent key (%). The cursor will jump to
the complementary square bracket, curly brace or paren. This is a 5 second
test to see whether your source code is doing what you really think it is.
This is a great debugging technique to find the missing or extra brace that
throws off everything else. Better yet, it's a quality control test. If,
upon finishing a subroutine, you bracket match the beginning brace with the
ending brace, you will have removed a large percentage of problems before
testing your code.
Some VI implementations, such as Vim, have syntax highlighting, which makes
a blown brace instantly visible. But bracket matching is the conclusive test.
Tags
"Now what were the arguments to that function again?" Sound familiar? Like
most modern professional grade editors, VI has a facility such that when
you encounter a subroutine usage, you can jump to other places where it's
used, as well as where it's declared and where it's defined. This is done
by tagging.
Here's the simplest way to implement tagging on a project. Find a directory
with a programming project. C is ideal, C++ will probably work, as will Perl
and some other languages. From within that directory, run the following command:
ctags *
That will create a new file called tags. If you were to look at
the new file you'd notice it's a list of many of your project's variables,
functions and classes, along with the files containing their definitions
and some other information. Once that file is created, go into one of your
source files (not a header file), find a function call, put the cursor over
the function name, and press the Ctrl+] key. You'll be brought to the definition
of the function. Press the Ctrl+T key and you'll be returned to where you
came from. This works for function names, global variables, class names,
and even #ifdef constants.
The preceding discussion is the simplest possible use of tagging. For real
life projects I recommend you review information on the ctags program, keeping in mind there are other programs that construct tag files. Pick the best program and options for your needs.
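The keystrokes above have EX command equivalents, which are useful in scripts and mappings. This is a small sketch, assuming Vim and an existing tags file. The function name main is only a placeholder for a name that appears in your own tags file.

```vim
" Look for a tags file next to the current file, then in the
" current working directory
set tags=./tags,tags

" Jump to the definition of main (same idea as Ctrl+] on the name)
tag main

" Show the stack of tag jumps made so far
tags

" Go back one jump (same as Ctrl+T)
pop
```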
Vim
By Steve Litt
If you use Red Hat or Mandrake, the vi command is symlinked to Vim.
Vim is a VI implementation, or to be absolutely correct, a VI workalike,
where VI is the editor originally packaged with UNIX.