Top

Learning the GNU development tools

--- The Detailed Node Listing ---

Installing GNU Software

Writing Good Programs

Using GNU Emacs

Compiling with Makefiles

The GNU build system

Using Automake and Autoconf

Using Libtool

Using Autotoolset

Using C and C++ effectively

Using Fortran effectively

Internationalization

Maintaining Documentation

Portable shell programming

Writing Autoconf macros

Legal issues with Free Software

Philosophical issues

Licensing Free Software

GNU GENERAL PUBLIC LICENSE

Preface

The purpose of this document is to introduce you to the GNU build system, and show you how to use it to write good code. It also discusses peripheral topics such as how to use GNU Emacs as a source code navigator, how to write good software, and the philosophical concerns behind the free software movement. The intended reader should be a software developer who knows his programming languages, and wants to learn how to package his programs in a way that follows the GNU coding standards.

This manual shows you how to develop high-quality software on GNU using the GNU build system that conforms to the GNU coding standards. These techniques are also useful for software development on GNU/Linux and most variants of the Unix system. In fact, one of the reasons for the elaborate GNU build system was to make software portable between GNU and other similar operating systems. We also discuss peripheral topics such as how to use GNU Emacs as an IDE (integrated development environment), and the various practical, legal and philosophical concerns behind software development.

When we speak of the GNU build system we refer primarily to the following four packages:

Autoconf produces a configuration shell script, named configure, which probes the installer platform for portability related information which is required to customize makefiles, configuration header files, and other application specific files. Then it proceeds to generate customized versions of these files from generic templates. This way, the user will not need to customize these files manually.

Automake produces makefile templates, Makefile.in, to be used by Autoconf, from a very high level specification stored in a file called Makefile.am. Automake produces makefiles that conform to the GNU makefile standards, taking away the extraordinary effort required to produce them by hand. Automake requires Autoconf in order to be used properly.

Libtool makes it possible to compile position independent code and build shared libraries in a portable manner. It does not require either Autoconf, or Automake and can be used independently. Automake however supports libtool and interoperates with it in a seamless manner.

Autotoolset FIXME: Add content

The GNU build system has two goals. The first is to simplify the development of portable programs. The second is to simplify the building of programs that are distributed as source code. The first goal is achieved by the automatic generation of a configure shell script. The second goal is achieved by the automatic generation of Makefiles and other shell scripts that are typically used in the building process. This way the developer can concentrate on debugging his source code, instead of his overly complex Makefiles. And the installer can compile and install the program directly from the source code distribution by a simple and automatic procedure.

The GNU build system needs to be installed only when you are developing programs that are meant to be distributed. To build a program from distributed source code, you only need a working make, a compiler, a shell, and sometimes standard Unix utilities like sed, awk, yacc, lex. The objective is to make software installation as simple and as automatic as possible for the installer. Also, by setting up the GNU build system such that it creates programs that don't require the build system to be present during their installation, it becomes possible to use the build system to bootstrap itself.

Some tasks that are simplified by the GNU build system include:

Building multi directory software packages. It is much more difficult to use raw make recursively. Having simplified this step, the developer is encouraged to organize his source code in a deep directory tree rather than lump everything under the same directory. Developers that use raw make often can't justify the inconvenience of recursive make and prefer to disorganize their source code. With the GNU tools this is no longer necessary.

Automatic configuration. You will never have to tell your users that they need to edit your Makefile. You yourself will not have to edit your Makefiles as you move new versions of your code back and forth between different machines.

Automatic makefile generation. Writing makefiles involves a lot of repetition, and in large projects it will get on your nerves. The GNU build system instead requires you to write Makefile.am files that are much more terse and easy to maintain.

Support for test suites. You can very easily write test suite code, and by adding one extra line in your Makefile.am make a check target available such that you can compile and run the entire test suite by running make check.

Automatic distribution building. The GNU build tools are meant to be used in the development of free software, therefore if you have a working build system in place for your programs, you can create a source code distribution out of it by running make distcheck.

Shared libraries. Building shared libraries becomes as easy as building static libraries.

The Autotoolset package complements the GNU build system by providing the following additional features:

Automatic generation of legal notices that are needed in order to apply the GNU GPL license.

Automatic generation of directory trees for new software packages, such that they conform to the GNITS standard (more or less).

Some rudimentary portability framework for C++ programs. There is a lot of room for improvement here, in the future. Also a framework for embedding text into your executable and handling include files across multiple directories.

Support for writing portable software that uses both Fortran and C++.

Additional support for writing software documentation in Texinfo, but also in LaTeX.

This effort began with my attempt to write a tutorial for Autoconf. It evolved into “Learning Autoconf and Automake”. Along the way I developed Autotoolset to deal with things that annoyed me or to cover needs from my own work. Ultimately I want this document to be both a unified introduction to the GNU build system as well as documentation for the Autotoolset package.

I believe that knowing these tools and having this know-how is very important, and should not be missed by engineering or science students who will one day go out and do software development for academic or industrial research. Many students are incredibly undertrained in software engineering and write a lot of bad code. This is very sad because, of all people, it is they who have the greatest need to write portable, robust and reliable code. I found from my own experience that moving away from Fortran and C, and towards C++ is the first step in writing better code. The second step is to use the sophisticated GNU build system and use it properly, as described in this document. Ultimately, I am hoping that this document will help people get over the learning curve of the second step, so they can be productive and ready to study the reference manuals that are distributed with all these tools.

This manual of course is still under construction. When I am done constructing it some paragraph somewhere will be inserted with the traditional run-down of summaries about each chapter. I write this manual in a highly non-linear way, so while it is under construction you will find that some parts are better-developed than others. If you wish to contribute sections of the manual that I haven't written or haven't yet developed fully, please contact me.

Chapters 1, 2, 3 and 4 are okay. Chapter 5 is okay too, but needs a little more work. I removed the other chapters to minimize confusion, but the sources for them are still being distributed as part of the Autotoolset package for those that found them useful. The other chapters need a lot of rewriting and they would do more harm than good at this point to the unsuspecting reader. Please contact me if you have any suggestions for improving this manual.

Remarks by Marcelo: I am currently updating this manual to the last release of the autoconf/automake tools.

Acknowledgements

This document and the Autotools package were originally written by Eleftherios Gkioulekas. Many people have further contributed to this effort, directly or indirectly, in various ways. Here is a list of these people. Please help me keep it complete and free of errors.

Philosophical issues and Licensing Free Software were written by Richard Stallman. Richard has also contributed many useful review comments and helped me with the legal paperwork. (see Philosophical issues)

The chapter on Fortran, and the Autotools support for developing software that is partly written in Fortran is derived from the work of John Eaton on GNU Octave, which I mainly generalized for use in other programs. (see Using Fortran effectively).

Mark Galassi was the first person, to the best of my knowledge, who tried to write an Autoconf tutorial. It is thanks to his work that I was inspired to begin this work.

FIXME: I need to start keeping track of acknowledgements here

Copying

This book that you are now reading is actually free. The information in it is freely available to anyone. The machine readable source code for the book is freely distributed on the internet and anyone may take this book and make as many copies as they like. (take a moment to check the copying permissions on the Copyright page). If you paid money for this book, what you actually paid for was the book's nice printing and binding, and the publisher's associated costs to produce it.

The following notice refers to the Autotoolset package, which includes this documentation, as well as the source code for utilities like acmkdir and for additional Autoconf macros. The complete set of GNU development tools also involves other packages, such as Autoconf, Automake, Libtool, Make, Emacs, Texinfo, the GNU C and C++ compilers and a few other accessories. These packages are free software, and you can obtain them from the Free Software Foundation. For details on doing so, please visit their web site http://www.fsf.org/. Although Autotoolset has been designed to work with the GNU build system, it is not yet an official part of the GNU project.

The Autotoolset package is also “free”; this means that everyone is free to use it and free to redistribute it on a free basis. The Autotoolset package is not in the public domain; it is copyrighted and there are restrictions on its distribution, but these restrictions are designed to permit everything that a good cooperating citizen would want to do. What is not allowed is to try to prevent others from further sharing any version of this package that they might get from you.

Specifically, we want to make sure that you have the right to give away copies of the programs that relate to Autotoolset , that you receive source code or else can get it if you want it, that you can change these programs or use pieces of them in new free programs, and that you know you can do these things.

To make sure that everyone has such rights, we have to forbid you to deprive anyone else of these rights. For example, if you distribute copies of the Autotoolset -related code, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights.

Also, for our own protection, we must make certain that everyone finds out that there is no warranty for the programs that relate to Autotoolset . If these programs are modified by someone else and passed on, we want their recipients to know that what they have is not what we distributed, so that any problems introduced by others will not reflect on our reputation.

The precise conditions of the licenses for the programs currently being distributed that relate to Autotoolset are found in the General Public Licenses that accompany them.

1 Installing GNU software

Free software is distributed in source code distributions. Many of these programs are difficult to install because they use system dependent features, and they require the user to edit makefiles and configuration headers. By contrast, the software distributed by the GNU project is autoconfiguring ; it is possible to compile it from source code and install it automatically, without any tedious user intervention.

In this chapter we discuss how to compile and install autoconfiguring software written by others. In the subsequent chapters we discuss how to use the development tools that allow you to make your software autoconfiguring as well.

1.1 Installing a GNU package

Autoconfiguring software is distributed with packaged source code distributions. These are big files with filenames of the form:

package-version.tar.gz

For example, the file autoconf-2.57.tar.gz contains version 2.57 of GNU Autoconf. We often call these files source distributions; sometimes we simply call them packages.

The steps for installing an autoconfiguring source code distribution are simple, and if the distribution is not buggy, can be carried out without substantial user intervention.

First, you have to unpack the package to a directory:

% gunzip foo-1.0.tar.gz
% tar xf foo-1.0.tar

This will create the directory foo-1.0 which contains the package's source code and documentation. Look for the file README to see if there's anything that you should do next. The README file might suggest that you need to install other packages before installing this one, or it might suggest that you have to do unusual things to install this package. If the source distribution conforms to the GNU coding standards, you will find many other documentation files like README. See Maintaining the documentation files, for an explanation of what these files mean.

Configure the source code. Once upon a time that used to mean that you have to edit makefiles and header files. In the wonderful world of Autoconf, source distributions provide a configure script that will do that for you automatically. To run the script type:

% cd foo-1.0
% ./configure

Now you can compile the source code. Type:

% make

and if the program is big, you can make some coffee. After the program compiles, you can run its regression test-suite, if it has one, by typing

% make check

If everything is okay, you can install the compiled distribution with:

% su
# make install

The make program launches the shell commands necessary for compiling, testing and installing the package from source code. However, make has no knowledge of what it is really doing. It takes its orders from makefiles , files called Makefile that have to be present in every subdirectory of your source code directory tree. From the installer perspective, the makefiles define a set of targets that correspond to things that the installer wants to do. The default target is always compiling the source code, which is what gets invoked when you simply run make . Other targets, such as ‘ install ’, ‘ check ’ need to be mentioned explicitly. Because make takes its orders from the makefile in the current directory, it is important to run it from the correct directory. See Compiling with Makefiles, for the full story behind make .

The configure program is a shell script that probes your system through a set of tests to determine things that it needs to know, and then uses the results to generate Makefile files from templates stored in files called Makefile.in. In the early days of the GNU project, developers used to write configure scripts by hand. Nowadays no one does that; configure scripts are automatically generated by GNU Autoconf from an input file configure.in. GNU Autoconf is part of the GNU build system and we first introduce it in The GNU build system.
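
The substitution step can be pictured with a few lines of shell. This is only a sketch of the idea, not the way Autoconf-generated scripts are actually implemented: the template contents, the "detected" values and the temporary directory are all made up for illustration.

```shell
#!/bin/sh
# Sketch: imitate how a configure script turns Makefile.in into Makefile.
# All file contents and values below are hypothetical.
workdir=$(mktemp -d)

# A tiny template with @VARIABLE@ placeholders, in the style of Makefile.in.
cat > "$workdir/Makefile.in" <<'EOF'
CC = @CC@
CFLAGS = @CFLAGS@
prefix = @prefix@
EOF

# "Probe" the system: use cc if it is on the PATH, otherwise fall back to gcc.
if command -v cc >/dev/null 2>&1; then
  detected_cc=cc
else
  detected_cc=gcc
fi
detected_cflags="-g -O2"
detected_prefix=/usr/local

# Substitute the results into the template to produce the real Makefile.
sed -e "s|@CC@|$detected_cc|" \
    -e "s|@CFLAGS@|$detected_cflags|" \
    -e "s|@prefix@|$detected_prefix|" \
    "$workdir/Makefile.in" > "$workdir/Makefile"

cat "$workdir/Makefile"
```

The point of the design is that the template stays generic and is shipped with the package, while the generated Makefile is specific to the machine on which configure was run.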

As it turns out, you don't have to write the Makefile.in templates by hand either. Instead you can use another program, GNU Automake, to generate Makefile.in templates from higher-level descriptions stored in files called Makefile.am . In these files you describe what is being created by your source code, and Automake computes the makefile targets for compiling, installing and uninstalling it. Automake also computes targets for compiling and running test suites, and targets for recursively calling make in subdirectories. The details about Automake are first introduced in Using Automake and Autoconf.

1.2 The Makefile standards

The GNU coding standards are a document that describes the requirements that must be satisfied by all GNU programs. These requirements are driven mainly by technical considerations, and they are excellent advice for writing good software. The makefile standards , a part of the GNU coding standards, require that your makefiles do a lot more than simply compile and install the software.

One requirement is cleaning targets; these targets remove the files that were generated while installing the package and restore the source distribution to a previous state. There are three cleaning targets that correspond to three levels of cleaning: clean, distclean, maintainer-clean.

clean Cleans up all the files that were generated by make and make check, but not the files that were generated by running configure. This target cleans the build, but does not undo the source configuration by the configure script.

distclean Cleans up all the files generated by make and make check , but also cleans the files that were generated by running configure . As a result, you can not invoke any other make targets until you run the configure script again. This target reverts your source directory tree back to the state in which it was when you first unpacked it.

maintainer-clean Cleans up all the files that distclean cleans. However it also removes files that the developers have automatically generated with the GNU build system. Because users shouldn't need the entire GNU build system to install a package, these files should not be removed in the final source distribution. However, it is occasionally useful for the maintainer to remove and regenerate these files.

Another type of cleaning that is required is erasing the package itself from the installation directory; uninstalling the package. To uninstall the package, you must call

% make uninstall

from the top level directory of the source distribution. This will work only if the source distribution is configured first. It will work best only if you do it from the same source distribution, with the same configuration, that you've used to install the package in the first place.

When you install GNU software, archive the source code to all the packages that you install in a directory like /usr/src or /usr/local/src . To do that, first run make clean on the source distribution, and then use a recursive copy to copy it to /usr/src . The presence of a source distribution in one of these directories should be a signal to you that the corresponding package is currently installed.

Francois Pinard came up with a cute rule for remembering what the cleaning targets do:

If configure or make did it, make distclean undoes it.

If make did it, make clean undoes it.

If make install did it, make uninstall undoes it.

If you did it, make maintainer-clean undoes it.
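
The difference between clean and distclean can be tried out on a toy makefile. The following is a minimal sketch, assuming a make program is installed; the makefile is written by hand for the demonstration, and the file names prog.out and config.status are stand-ins for "something make built" and "something configure wrote".

```shell
#!/bin/sh
# Sketch: a toy hand-written makefile with two of the cleaning levels.
# File names (prog.out, config.status) are stand-ins chosen for illustration.
demo=$(mktemp -d)
cd "$demo" || exit 1

# printf is used so that the recipe lines start with a real tab character.
printf 'all:\n\techo built > prog.out\n' > Makefile
printf 'clean:\n\trm -f prog.out\n' >> Makefile
printf 'distclean: clean\n\trm -f config.status\n' >> Makefile

echo configured > config.status   # pretend that configure ran
make -s all                       # pretend to compile

make -s clean                     # removes what make created
after_clean=$(ls | tr '\n' ' ')

make -s distclean                 # also removes what configure created
after_distclean=$(ls | tr '\n' ' ')

echo "after clean:     $after_clean"
echo "after distclean: $after_distclean"
```

In a real autoconfiguring package the Makefile itself is generated by configure, so distclean removes it as well; here it survives only because it was written by hand.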

GNU standard compliant makefiles also have a target for generating tags . Tags are files, called TAGS , that are used by GNU Emacs to allow you to navigate your source distribution more efficiently. More specifically, Emacs uses tags to take you from a place where a C function is being used in a file, to the file and line number where the function is defined. To generate the tags call:

% make tags

Tags are particularly useful when you are not the original author of the code you are working on, and you haven't yet memorized where everything is. See Navigating source code, for all the details about navigating large source code trees with Emacs.

Finally, in the spirit of free redistributable code, there must be targets for cutting a source code distribution. If you type

% make dist

it will rebuild the foo-1.0.tar.gz file that you started with. If you modified the source, the modifications will be included in the distribution (and you should probably change the version number). Before putting a distribution up on FTP, you can test its integrity with:

% make distcheck

This makes the distribution, then unpacks it in a temporary subdirectory and tries to configure it, build it, run the test-suite, and check if the installation script works. If everything is okay then you're told that your distribution is ready.

Writing reliable makefiles that support all of these targets is a very difficult undertaking. This is why we prefer to generate our makefiles instead with GNU Automake.

1.3 Configuration options

The ‘ configure ’ script accepts many command-line flags that modify its behaviour and the configuration of your source distribution. To obtain a list of all the options that are available type

% ./configure --help

on the shell prompt.

The most useful parameter that the installer controls during configuration is the directory where they want the package to be installed. During installation, the following files go to the following directories:

Executables  ==> /usr/local/bin
Libraries    ==> /usr/local/lib
Header files ==> /usr/local/include
Man pages    ==> /usr/local/man/man?
Info files   ==> /usr/local/info

The /usr/local directory is called the prefix. The default prefix is always /usr/local but you can set it to anything you like when you call ‘configure’ by adding a ‘--prefix’ option. For example, suppose that you are not a privileged user, so you can not install anything in /usr/local, but you would still like to install the package for your own use. Then you can tell the ‘configure’ script to install the package in your home directory ‘/home/username’:

% ./configure --prefix=/home/username
% make
% make check
% make install

The ‘ --prefix ’ argument tells ‘ configure ’ where you want to install your package, and ‘ configure ’ will take that into account and build the proper makefile automatically.

If you are installing the package on a filesystem that is shared by computers that run variations of GNU or Unix, you need to install the files that are independent of the operating system in a shared directory, but separate the files that are dependent on the operating systems in different directories. Header files and documentation can be shared. However, libraries and executables must be installed separately. Usually the scheme used to handle such situations is:

Executables  ==> /usr/local/system/bin
Libraries    ==> /usr/local/system/lib
Header files ==> /usr/local/include
Man pages    ==> /usr/local/man/man?
Info files   ==> /usr/local/info

The directory /usr/local/system is called the executable prefix, and it is usually a subdirectory of the prefix. In general, it can be any directory. If you don't specify the executable prefix, it defaults to being equal to the prefix. To change that, use the ‘--exec-prefix’ flag. For example, to configure for a GNU/Linux system, you would run:

% ./configure --exec-prefix=/usr/local/linux

To configure for GNU/Hurd, you would run:

% ./configure --exec-prefix=/usr/local/hurd

In general, there are many directories where a package may want to install files. Some of these directories are controlled by the prefix, where others are controlled by the executable prefix. See Installation standard directories, for a complete discussion of what these directories are, and what they are for.
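
As a rough illustration of how the two prefixes determine where files go, here is a small shell sketch. It follows the directory layout listed in this section; the function name show_dirs and the example arguments are invented for the demonstration, and a real configure script computes many more directories than these.

```shell
#!/bin/sh
# Sketch: how the installation directories follow from the two prefixes.
# The variable names mirror the GNU directory variables; the values passed
# in are examples only.
show_dirs() {
  prefix=$1
  exec_prefix=${2:-$prefix}   # the executable prefix defaults to the prefix
  echo "bindir=$exec_prefix/bin"
  echo "libdir=$exec_prefix/lib"
  echo "includedir=$prefix/include"
  echo "mandir=$prefix/man"
  echo "infodir=$prefix/info"
}

show_dirs /usr/local
show_dirs /usr/local /usr/local/linux
```

The second call shows the shared-filesystem case: only bindir and libdir move under the executable prefix, while headers and documentation stay under the common prefix.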

Some packages allow you to enable or disable certain features while you configure the source code. They do that with flags of the form:

--with-package
--enable-feature
--without-package
--disable-feature

The --enable flags usually control whether to enable certain optional features of the package. Support for international languages, debugging features, and shared libraries are features that are usually controlled by these options. The --with flags instead control whether to compile and install certain optional components of the package. The specific flags that are available for a particular source distribution should be documented in the README file.
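
To make the four forms concrete, the sketch below maps each one to a yes/no shell variable, which is roughly how such options end up being visible inside a configure script. It is a simplified imitation written for illustration, not code taken from a real configure script: it ignores options that carry an "=ARG" value, and the feature and package names (debug, shared, readline, x) are only examples.

```shell
#!/bin/sh
# Sketch: how options of this form are commonly turned into shell variables.
# Simplified on purpose; handles only the plain form without "=ARG" values.
parse_option() {
  case $1 in
    --enable-*)  name=${1#--enable-};  echo "enable_$name=yes" ;;
    --disable-*) name=${1#--disable-}; echo "enable_$name=no" ;;
    --with-*)    name=${1#--with-};    echo "with_$name=yes" ;;
    --without-*) name=${1#--without-}; echo "with_$name=no" ;;
    *)           echo "unrecognized: $1" ;;
  esac
}

for opt in --enable-debug --disable-shared --with-readline --without-x; do
  parse_option "$opt"
done
```

Note that --disable-feature is simply the negative of --enable-feature, and --without-package the negative of --with-package, so each pair controls a single setting.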

Finally, configure scripts can be passed parameters via environment variables. One of the things that configure does is decide what compiler to use and what flags to pass to that compiler. You can overrule the decisions that configure makes by setting the environment variables CC and CFLAGS. For example, to specify that you want the package to compile with full optimization and without any debugging symbols (which is a bad idea, yet people want to do it):

% export CFLAGS="-O3"
% ./configure

To tell configure to use the system's native compiler instead of gcc , and compile without optimization and with debugging symbols:

% export CC="cc"
% export CFLAGS="-g"
% ./configure

This assumes that you are using the bash shell as your default shell. If you use the csh or tcsh shells, you need to assign environment variables with the setenv command instead. For example:

% setenv CFLAGS "-O3"
% ./configure

Similarly, the flags CXX , CXXFLAGS control the C++ compiler.
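
In Bourne-style shells the variables can also be set for a single command, without exporting them for the rest of the session. The sketch below demonstrates this with a stand-in script; fake-configure is hypothetical and only prints what it receives, whereas a real configure script would use the values to choose the compiler and its flags.

```shell
#!/bin/sh
# Sketch: set CC and CFLAGS for a single command only.
# "fake-configure" is a hypothetical stand-in that just reports what it sees.
dir=$(mktemp -d)
cat > "$dir/fake-configure" <<'EOF'
#!/bin/sh
echo "CC=${CC:-unset} CFLAGS=${CFLAGS:-unset}"
EOF

unset CC CFLAGS

# Assignments written in front of a command are placed in that command's
# environment and do not persist afterwards.
with_vars=$(CC=cc CFLAGS="-g" sh "$dir/fake-configure")
without_vars=$(sh "$dir/fake-configure")

echo "$with_vars"
echo "$without_vars"
```

This form avoids leaving CC or CFLAGS set in your shell, where they could silently affect the next package you configure.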

1.4 Doing a VPATH build

Autoconfiguring source distributions also support vpath builds. In a vpath build, the source distribution is stored in a, possibly read-only, directory, and the actual building takes place in a different directory where all the generated files are being stored. We call the first directory, the source tree , and the second directory the build tree . The build tree may be a subdirectory of the source tree, but it is better if it is a completely separate directory.

If you, the developer, use the standard features of the GNU build system, you don't need to do anything special to allow your packages to support vpath builds. The only exception to this is when you define your own make rules (see General Automake principles). Then you have to follow certain conventions to allow vpath to work correctly.

You, the installer, however do need to do something special. You need to install and use GNU make. Most Unix make utilities do not support vpath builds, or their support doesn't work. GNU make is extremely portable, and if vpath is important to you, there is no excuse for not installing it.

Suppose that /sources/foo-0.1 contains a source distribution, and you want to build it in the directory /build/foo-0.1 . Assuming that both directories exist, all you have to do is:

% cd /build/foo-0.1
% /sources/foo-0.1/configure ...options...
% make
% make check
% su
# make install

The configure script and the generated makefiles will take care of the rest.
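
The mechanism underneath is the VPATH variable of make, which names extra directories to search for source files. The following toy example is a sketch of that mechanism only, assuming GNU make is installed as make; the makefile is written by hand, and the directory and file names are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: what a vpath build relies on, shown with a toy hand-written
# makefile. Assumes GNU make is installed as "make"; the directory and
# file names are invented for this example.
top=$(mktemp -d)
mkdir "$top/src" "$top/build"

echo "hello" > "$top/src/greeting.in"

# The makefile lives in the build tree and points VPATH at the source tree,
# so make can find greeting.in even though it is not in the build directory.
printf 'VPATH = ../src\n' > "$top/build/Makefile"
printf 'greeting.out: greeting.in\n\tcp $< $@\n' >> "$top/build/Makefile"

(cd "$top/build" && make -s)

src_files=$(ls "$top/src")
build_files=$(ls "$top/build" | tr '\n' ' ')
echo "source tree: $src_files"
echo "build tree:  $build_files"
```

After the run the source tree still contains only greeting.in, and the generated file sits in the build tree, which is the property that lets the source directory be read-only.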

vpath builds are preferred by some people for the following reasons:

They prevent the build process from cluttering your source directory with all sorts of build files.

To remove a build, all you have to do is remove the build directory.

You can build the same source multiple times using different options. This is very useful if you would like to write a script that will run the test suite for a package while the package is configured in many different ways (e.g. different features, different compiler optimization, and so on). It is also useful if you would like to do the same with releasing binary distributions of the source.

1.5 Making a binary distribution

After compiling a source distribution, instead of installing it, you can make a snapshot of the files that it would install and package that snapshot in a tarball. It is often convenient for installers to install from such snapshots rather than compile from source, especially when the source is extremely large, or when the number of packages that they need to install is large.

To create a binary distribution run the following commands as root:

# make install DESTDIR=/tmp/dist
# tar -C /tmp/dist -cvf package-version.tar .
# gzip -9 package-version.tar

The variable DESTDIR specifies a directory, alternative to root, for installing the compiled package. The directory tree under that directory is the exact same tree that would have normally been installed. Why not just specify a different prefix? Because very often, the prefix that you use to install the software affects the contents of the files that actually get installed.
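
A staged install can be rehearsed on a toy package without touching the real /usr/local. The sketch below assumes make and tar are available; the package name foo, its single file, and the hand-written install rule are all hypothetical and much simpler than what Automake generates.

```shell
#!/bin/sh
# Sketch: a staged install with DESTDIR, using a toy makefile.
# The package name "foo", its single file and all paths are hypothetical.
work=$(mktemp -d)
cd "$work" || exit 1

echo "pretend this is a compiled program" > foo

# The install rule prepends $(DESTDIR) to every destination, which is the
# convention that makes staged installs possible.
printf 'prefix = /usr/local\n' > Makefile
printf 'install:\n' >> Makefile
printf '\tmkdir -p $(DESTDIR)$(prefix)/bin\n' >> Makefile
printf '\tcp foo $(DESTDIR)$(prefix)/bin/foo\n' >> Makefile

make -s install DESTDIR="$work/dist"

# Package the staged tree; "." names the contents of the staging directory.
tar -C "$work/dist" -cf "$work/foo-1.0.tar" .
listing=$(tar -tf "$work/foo-1.0.tar")
echo "$listing"
```

The archive contains paths relative to the staging directory, such as usr/local/bin/foo, so unpacking it at the root of another machine places the files where a normal install would have put them, while the prefix recorded inside the installed files is still /usr/local.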

Please note that under the terms of the GNU General Public License, if you distribute your software as a binary distribution, you also need to provide the corresponding source distribution. The simplest way to comply with this requirement is to distribute both distributions together.

2 Writing Good Programs

2.1 Why good code is important

When you work on a software project, one of your short-term goals is to solve a problem at hand. If you are doing this because someone asked you to solve the problem, then all you need to do to look good in that person's eyes is to deliver a program that works. Nevertheless, regardless of how little that person may appreciate this, doing just that is not good enough. Once you have code that gives the right answer to a specific set of problems, you will want to make improvements to it. As you make these improvements, you would like to have proof that your code's known reliability hasn't regressed. Also, tomorrow you will want to move on to a different set of related problems by repeating as little work as possible. Finally, one day you may want to pass the project on to someone else or recruit another developer to help you out with certain parts. You need to make it possible for the other person to get up to speed without reinventing your efforts. To accomplish these equally important goals you need to write good code.

2.2 Choosing a good programming language

To write good software, you must use the appropriate programming language and use it well. To make your software free, it should be possible to compile it with free tools on a free operating system. Therefore, you should avoid using programming languages that do not have a free compiler.

The C programming language is the native language of GNU, and the GNU coding standards encourage you to program in C. The main advantages of C are that it can be compiled with the system's native compiler, many people know C, and it is easy to learn. Nevertheless, C has weaknesses: it forces you to manually manage memory allocation, and any mistakes you might make can lead to very difficult bugs. Also C forces you to program at a low level. Sometimes it is desirable to program at a low level, but there are also cases where you want to build on a higher level.

For projects where you would like a higher-level compiled language, the recommended choice is to use C++. The GNU project distributes a free C++ compiler and nowadays most GNU systems that have a C compiler also have the free C++ compiler. The main advantage of C++ is that it can manage dynamic memory for you: constructors, destructors and the standard containers allocate and release memory automatically. C++ also has a lot of powerful features that allow you to program at a higher level than C, bringing you closer to the algorithms and the concepts involved, and making it easier to write robust programs. At the same time, C++ does not hide low-level details from you and you have the freedom to do the same low-level hacks that you had in C, if you choose to. In fact, C++ is almost entirely backwards compatible with C and it is very easy to mix C and C++ code. Finally, C++ is an industry standard. As a result, it has been used to solve a variety of real-world problems and its specification has evolved for many years to make it a powerful and mature language that can tackle such problems effectively. The C++ specification was frozen and became an ANSI standard in 1998.

One of the disadvantages of C++ is that C++ object files compiled by different C++ compilers cannot be linked together. In order to compile C++ to machine language, a lot of compilation issues need to be deferred to the linking stage. Because object file formats are not traditionally sophisticated enough to handle these issues, C++ compilers resort to various ugly kludges. The problem is that different compilers do these kludges differently, making object files incompatible across compilers. This is not a terrible problem, since object files are incompatible across different platforms anyway. It is only a problem when you want to use more than one compiler on the same platform. Another disadvantage of C++ is that it is harder to interface a C++ library to another language than it is to interface a C library. Finally, not as many people know C++ as well as they know C, and C++ is a very extensive and difficult language to master. However, these disadvantages must be weighed against the advantages. There is a price to using C++, but the price comes with a reward.
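One common way to soften both the linking and the interfacing problems is to give a library a plain C interface, even when parts of it are implemented in C++. The fragment below is a minimal sketch of that idea, not a prescription; the function name vec_dot is made up for the illustration. It compiles as C, and the __cplusplus guard lets a C++ compiler read the same declaration with C linkage.

```c
#include <stddef.h>

/* Declarations that would normally live in a public header.
   The guard gives the function C linkage when a C++ compiler
   reads it, so its symbol name is not mangled. */
#ifdef __cplusplus
extern "C" {
#endif

double vec_dot(const double *a, const double *b, size_t n);

#ifdef __cplusplus
}
#endif

/* Implementation, written here in C. */
double vec_dot(const double *a, const double *b, size_t n)
{
  double sum = 0.0;
  size_t i;
  for (i = 0; i < n; i++)
    sum += a[i] * b[i];
  return sum;
}
```

Callers written in C, in C++, or in a language with a C foreign-function interface can all link against such a function in the same way.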

If you need a higher-level interpreted language, then the recommended choice is to use Guile. Guile is the GNU variant of Scheme, a LISP-like programming language. Guile is an interpreted language, and you can write full programs in Guile, or use the Guile interpreter interactively. Guile is compatible with the R4RS standard but provides a lot of GNU extensions. The GNU extensions are so extensive that it is possible to write entire applications in Guile. Most of the low-level facilities that are available in C, are also available in Guile.

What makes the Guile implementation of Scheme special is not the extensions themselves, but the fact that it is very easy for any developer to add their own extensions to Guile by implementing them in C. By combining C and Guile you leverage the advantages of both compiled and interpreted languages. Performance-critical functionality can be implemented in C and higher-level software development can be done in Guile. Also, because Guile is interpreted, when you make your C code available through an extended Guile interpreter, the user can also use the functionality of that code interactively through the interpreter.

The idea of extensible interpreted languages is not new. Other examples of extensible interpreted languages are Perl, Python and Tcl. What sets Guile apart from these languages is the elegance of Scheme. Scheme is the holy grail in the quest for a programming language that can be extended to support any programming paradigm by using the least amount of syntax. Scheme has natural support for both arbitrary precision integer arithmetic and floating point arithmetic. The simplicity of Scheme syntax, and the completeness of Guile, make it very easy to implement specialized scripting languages simply by translating them to Scheme. In Scheme algorithms and data are interchangeable. As a result, it is easy to write Scheme programs that manipulate Scheme source code. This makes Scheme an ideal language for writing programs that manipulate algorithms instead of data, such as programs that do symbolic algebra. Because Scheme can manipulate its own source code, a Scheme program can save its state by writing Scheme source code into a file, and by parsing it later to load it back up again. This feature alone is one reason why engineers should use Guile to configure and drive numerical simulations.

Some people like to use Fortran 77. This is in many ways a good language for developing the computational core of scientific applications. We do have free compilers for Fortran 77, so using it does not restrict our freedom (see Using Fortran effectively). Also, Fortran 77 is an aggressively optimized language, and this makes it very attractive to engineers who want to write code optimized for speed. Unfortunately, Fortran 77 cannot do anything well except array-oriented numerical computations. Managing input/output is unnecessarily difficult with Fortran, and there are even computational areas, such as arbitrary-precision integer arithmetic and symbolic computation, that are not supported.

There are many variants of Fortran like Fortran 90, and HPF. Fortran 90 attempts, quite miserably, to make Fortran 77 more like C++. HPF allows engineers to write numerical code that runs on parallel computers. These variants should be avoided for two reasons:

There are no free compilers for Fortran 90 or HPF. If you happen to use a proprietary operating system, you might as well make use of proprietary compilers if they generate highly optimized code and that is important to you. Nevertheless, in order for your software to be free, it should be possible to compile it with free tools on a free operating system. Because it is possible to make parallel computers using GNU/Linux (see the Beowulf project), parallelized software can also be free. Therefore both Fortran 90 and HPF should be avoided.

Another problem with these variants is that they are ad hoc languages that have been invented to enable Fortran to do things that it cannot do by design. Eventually, when engineers want to do things that Fortran 90 can't do either, it will be necessary to extend Fortran again, rewrite the compilers and produce yet another variant. What engineers need is a programming language that can be extended by writing software in that same language. The C++ programming language can do this without loss of performance. The departmentalization of disciplines in academia has made it very difficult for such a project to take off. Despite that, there is ongoing research in this area (for example, see the Blitz++ project).

If you have written a program entirely in Fortran, please do not ask anyone else to maintain your code, unless that person is like you and also knows only Fortran. If Fortran is the only language that you know, then please learn at least C and C++ and use Fortran only when necessary. Please do not hold the opinion that contributions in science and engineering are “true” contributions and software development is just a “tool”. This bigoted attitude is behind the thousands of lines of ugly unmaintainable code that go around in many places. Good software development can be an important contribution in its own right, and regardless of what your goals are, please appreciate it and encourage it. To maximize the benefits of good software, please make your software free.

2.3 Developing libraries

The key to better code is to move away from developing monolithic throw-away hacks that do only one job, and to focus on developing libraries. Break down the original problem into parts, and the parts into smaller parts, until you get down to simple subproblems that can be easily tested, and from which you can construct solutions for both the original problem and future variants. Every library that you write is a legacy that you can share with other developers who want to solve similar problems. Each library will allow these other developers to focus on their problem and not have to reinvent from scratch the parts that are common with your work. You should definitely make libraries out of subproblems that are likely to be broadly useful. Please be very liberal in what you consider “broadly useful”. Please program in a defensive way that renders as much code as possible reusable, regardless of whether or not you plan to reuse it in the near future. The final application should merely have to assemble all the libraries together and make their functionality accessible to the user through a good interface.

It is very important for each of your libraries to have a complete test suite. The purpose of the test suite is to detect bugs in the library and to prove to you or convince you, the developer, that the library works. A test suite is composed of a collection of test programs that link with your libraries and experiment with the features provided by the library. These test programs should return with

exit(0);

if they do not detect anything wrong with the library and with

exit(1);

if they detect problems. The test programs should not be installed with the rest of the package. They are meant to be run after your software is compiled and before it is installed. Therefore, they should be written so that they can run using the compiled but uninstalled files of the library. Test programs should not output messages by default. They should run completely quietly and communicate with the environment in a yes or no fashion using the exit code. However, it is useful for test programs to output debugging information when they fail during development. Statements that output such information should be surrounded by conditional directives like this:

#if INSPECT_ERRORS
printf("Division by zero: %d / %d\n", a, b);
#endif

This way it becomes easy to switch them on or off upon demand. The preferred way to manipulate a macro like INSPECT_ERRORS is by adding a switch to your configure script. You can do this by adding the following lines to configure.in:

AC_ARG_WITH(inspect,
  [  --with-inspect          Inspect test suite errors],
  [ AC_DEFINE(INSPECT_ERRORS, 1, "Inspect test suite errors")],
  [ AC_DEFINE(INSPECT_ERRORS, 0, "Inspect test suite errors")])

After the library is debugged, the debug statements should not be removed. If a future version of the library regresses and an old test begins to fail again, it will be useful to be able to reactivate the same error messages that were useful in debugging the test when it was first put together, and it may be necessary to add a few new ones.
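Putting these conventions together, the body of a test program might be organized as in the following sketch. The function under test, safe_divide, and the helpers check_divide and run_checks are hypothetical stand-ins for real library code. INSPECT_ERRORS is the macro described above, defaulted to 0 here only so that the sketch compiles without a configure script.

```c
#include <stdio.h>

/* Normally defined by configure through config.h; default to quiet. */
#ifndef INSPECT_ERRORS
#define INSPECT_ERRORS 0
#endif

/* Hypothetical library function under test: integer division that
   reports failure instead of dividing by zero.
   Returns 0 on success and -1 on error. */
int safe_divide(int a, int b, int *result)
{
  if (b == 0)
    return -1;
  *result = a / b;
  return 0;
}

/* Check one expectation; print details only when INSPECT_ERRORS is on.
   Returns 1 if the check failed, 0 otherwise. */
static int check_divide(int a, int b, int expect_status, int expect_value)
{
  int value = 0;
  int status = safe_divide(a, b, &value);
  if (status == expect_status && (status != 0 || value == expect_value))
    return 0;
#if INSPECT_ERRORS
  printf("FAILED: %d / %d gave status %d, value %d\n", a, b, status, value);
#endif
  return 1;
}

/* Run every check and return the number of failures. */
int run_checks(void)
{
  int failures = 0;
  failures += check_divide(6, 3, 0, 2);
  failures += check_divide(7, 2, 0, 3);
  failures += check_divide(1, 0, -1, 0);
  return failures;
}
```

The main function of the test program would then only need to return 0 when run_checks reports no failures and 1 otherwise, which matches the exit(0) and exit(1) convention above.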

The best time to write each test program is as soon as possible. You should not be lazy, and you should not just keep throwing in code after code after code. The minute there is enough code in there to put together some kind of test program, just do it! When you write new code, it is easy to think that you are producing work with every new line of code that is written. The reality is that you know you have produced new work every time you write a working test program for new features, and not a minute before. Another time when you should definitely write a test program is when you find a bug while ordinarily using the library. Then, write a test program that triggers the bug, fix the bug, and keep the test in your test suite. This way, if a future modification reintroduces the same bug, it will be detected.

Please document your library as you go. The best time to update your documentation is immediately after you get new test programs checking out new features. You might feel that you are too busy to write documentation, but the truth of the matter is that you will always be too busy. In fact, if you are a busy person, you are likely to have many other obligations competing for your attention. There may be times when you have to stay away from a project for a long time. If you have consistently been maintaining documentation, it will help you refocus on your project even after many months of absence.

2.4 Developing applications

Applications are complete executable programs that can be run by the end-user. With library-oriented development the actual functionality is developed by writing libraries and debugged by developing test-suites for each library. With command-line oriented applications, the application source code parses the arguments that are passed to it by the user, and calls up the right functions in the library to carry out the user's requests. With GUI 1 applications, the application source code creates the widgets that compose the interface, binds them to actions, and then enters an event loop. Each action is implemented in terms of the functionality provided by the appropriate library.

It should be possible to implement applications by using relatively few application-specific source files, since most of the functionality is actually done in libraries. In some cases, the application is simple enough that it would be overkill to package its functionality as a library. Nevertheless, in such cases please separate the source code that handles actual functionality from the source code that handles the user interface. Also, please always separate the code that handles input/output from the code that does actual computations. If these aspects of your source code are sufficiently separated then you make it easier for other people to reuse parts of your code in their applications. You also make it easier for yourself to switch to library-oriented development when your application grows and is no longer “simple enough”.
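As a small illustration of this separation, the sketch below keeps a computation free of any input/output and puts the reporting in a separate function that merely formats the result. Both function names are invented for the example; nothing here is taken from an actual package.

```c
#include <stdio.h>
#include <stddef.h>

/* Pure computation: no printing, no reading, no global state.
   This is the part that could later move into a library unchanged. */
double mean(const double *x, size_t n)
{
  double sum = 0.0;
  size_t i;
  if (n == 0)
    return 0.0;
  for (i = 0; i < n; i++)
    sum += x[i];
  return sum / (double) n;
}

/* Input/output layer: formats a result that was computed elsewhere.
   Returns the number of characters written, or a negative value on error. */
int print_mean(FILE *out, const double *x, size_t n)
{
  return fprintf(out, "mean = %g\n", mean(x, n));
}
```

With this split, a test program can check mean without capturing any output, and a GUI or a different command-line front end can reuse it without touching print_mean.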

Library-oriented development allows you to write good and robust applications. In return it requires discipline. Sometimes you may need to add experimental functionality that is not available through your libraries. The right thing to do is to extend the appropriate library. The easy thing to do is to implement it as part of your application-specific source code. If the feature is experimental and undergoing many changes, it may be best to go with the easy approach at first. Still, when the feature matures, please migrate it to the appropriate library, document it, and take it out of the application source code. What we mean by discipline is doing these migrations, when the time is right, despite pressures from “real life”, such as deadlines, pointy-haired bosses, and nuclear terrorism. A rule of thumb for deciding when to migrate code to a library is when you find yourself cutting and pasting chunks of code from application to application. If you do not do the right thing, your code will become increasingly hard to debug, harder to maintain, and less reliable.

Applications should also be documented, especially the ones that are command-line oriented. Application documentation should explain thoroughly everything that users need to know to use the application effectively, and should be distributed separately from the application itself. Nevertheless, applications should recognize the --help switch and output a synopsis of how the application is used. Applications should also recognize the --version switch and state their version number. The easiest way to make applications understand these two switches is to use the GNU Argp library.
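Argp takes care of both switches for you. To show only the behavior that is expected, here is a hand-written sketch that deliberately does not use Argp. The names classify_args, handle_standard_switches, PROGRAM_NAME and PROGRAM_VERSION are assumptions made for the example, as is the wording of the synopsis.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical values; a real package would take them from config.h. */
#define PROGRAM_NAME    "foo"
#define PROGRAM_VERSION "0.1"

enum action { ACTION_RUN, ACTION_HELP, ACTION_VERSION };

/* Scan the arguments and decide what the program should do.
   The first --help or --version found wins. */
enum action classify_args(int argc, char **argv)
{
  int i;
  for (i = 1; i < argc; i++)
    {
      if (strcmp(argv[i], "--help") == 0)
        return ACTION_HELP;
      if (strcmp(argv[i], "--version") == 0)
        return ACTION_VERSION;
    }
  return ACTION_RUN;
}

/* Print the synopsis or the version. Returns 1 if something was
   printed and the caller should exit, 0 if it should carry on. */
int handle_standard_switches(int argc, char **argv)
{
  switch (classify_args(argc, argv))
    {
    case ACTION_HELP:
      printf("Usage: %s [OPTION]... [FILE]...\n", PROGRAM_NAME);
      printf("  --help     display this help and exit\n");
      printf("  --version  output version information and exit\n");
      return 1;
    case ACTION_VERSION:
      printf("%s %s\n", PROGRAM_NAME, PROGRAM_VERSION);
      return 1;
    default:
      return 0;
    }
}
```

An application would call handle_standard_switches at the top of main and exit successfully when it returns 1. Keeping the decision in classify_args, apart from the printing, also makes the switch handling easy to cover in a test program.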

2.5 Free software is good software

One of the reasons why you should write good code is that it allows you to make your code robust, reliable and most useful to your needs. Another reason is to make it useful to other people too, and make it easier for them to work with your code and reuse it for their own work. In order for this to be possible, you need to worry about a few obnoxious legal issues.

2.6 Invoking the ‘gpl’ utility

Maintaining these legalese notices can be quite painful after some time. To ease the burden, Autotools distributes a utility called ‘gpl’. This utility will conveniently generate for you all the legal wording you will ever want to use. It is important to know that this application is not approved in any way by the Free Software Foundation. By this I mean that I haven't asked their opinion of it yet.

To create the file COPYING type:

% gpl -l COPYING

If you want to include a copy of the GPL in your documentation, you can generate a copy in texinfo format like this:

% gpl -lt gpl.texi

Also, every time you want to create a new file, use the ‘gpl’ utility to generate the copyright notice. If you want it covered by the GPL use the standard notice. If you want to invoke the Guile-like permissions, then also use the library notice. If you want to grant unlimited permissions, meaning no copyleft, use the special notice. The ‘gpl’ utility takes many different flags to take into account the different commenting conventions.

For a C file, create the standard notice with

% gpl -c file.c

the library notice with

% gpl -cL file.c

and the special notice with

% gpl -cS file.c

For a C++ file, create the standard notice with

% gpl -cc file.cc

the library notice with

% gpl -ccL file.cc

and the special notice with

% gpl -ccS file.cc

For a script (Bash, Perl, Tcl) that uses hash marks for commenting, create the standard notice with

% gpl -sh foo.pl

the library notice with

% gpl -shL foo.tcl

and the special notice with

% gpl -shS foo.pl

It does not make sense to use the library notice if no executable is being formed from this file. If, however, you parse that file into C code that is then compiled into object code, then you may consider using the library notice on it instead of the special notice. One of the features provided by Autotools allows you to embed text, such as Tcl scripts, into the executable. In that case, you can use the library notice to license the original text.

For files that define autoconf macros:

% gpl -m4 file.m4

In general, we exempt autoconf macro files from the GNU GPL because the terms of autoconf also exclude its output, the ‘configure’ script, from the GPL.

For Makefile.am, or files that describe targets:

% gpl -am Makefile.am

We also exempt these from the GPL because they are so trivial that it makes no sense to add copyleft protection.

2.7 Inserting notices with Emacs

If you are using GNU Emacs, then you can insert these copyright notices on demand while you're editing your source code. Autotools bundles two Emacs packages, gpl and gpl-copying, which provide you with equivalents of the ‘gpl’ command that can be run under Emacs. These packages will be byte-compiled and installed automatically for you while installing Autotools.

To use these packages, in your .emacs you must declare your identity by adding the following commands:

(setq user-mail-address "me@here.com")
(setq user-full-name "My Name")

Then you must require the packages to be loaded:

(require 'gpl)
(require 'gpl-copying)

These packages introduce a set of Emacs commands, all of which are prefixed with gpl-. To invoke any of these commands press M-x, type the name of the command and press enter.

The following commands will generate notices for your source code:

‘gpl-c’ Insert the standard GPL copyright notice using C commenting.

‘gpl-cL’ Insert the standard GPL copyright notice using C commenting, followed by a Guile-like library exception. This notice is used by the Guile library. You may want to use it for libraries that you write that implement some type of a standard that you wish to encourage. You will be prompted for the name of your package.

‘gpl-cc’ Insert the standard GPL copyright notice using C++ commenting.

‘gpl-ccL’ Insert the standard GPL copyright notice using C++ commenting, followed by a Guile-like library exception. You will be prompted for the name of your package.

‘gpl-sh’ Insert the standard GPL copyright notice using shell commenting (i.e. hash marks).

‘gpl-shL’ Insert the standard GPL copyright notice using shell commenting, followed by a Guile-like library exception. This can be useful for source files, like Tcl files, which are executable code that gets linked in to form an executable, and which use hash marks for commenting.

‘gpl-shS’ Insert the standard GPL notice using shell commenting, followed by the special Autoconf exception. This is useful for small shell scripts that are distributed as part of a build system.

‘gpl-m4’ Insert the standard GPL copyright notice using m4 commenting (i.e. dnl) and the special Autoconf exception. This is the preferred notice for new Autoconf macros.

‘gpl-el’ Insert the standard GPL copyright notice using Elisp commenting. This is useful for writing Emacs extension files in Elisp.

‘gpl-insert-copying-texinfo’ Insert a set of paragraphs very similar to the ones appearing in the Copying section of this manual. It is a good idea to include this notice in an unnumbered chapter titled “Copying” in the Texinfo documentation of your source code. You will be prompted for the title of your package. That title will replace the word Autotools as it appears in the corresponding section of this manual.

‘gpl-insert-license-texinfo’ Insert the full text of the GNU General Public License in Texinfo format. If your documentation is very extensive, it may be a good idea to include this notice either at the very beginning of your manual, or at the end. You should include the full license if you plan to distribute the manual separately from the package as a printed book.

3 Using GNU Emacs

Emacs is an environment for running Lisp programs that manipulate text interactively. To call Emacs merely an editor does not do it justice, unless you redefine the word “editor” to the broadest meaning possible. Emacs is so extensive, powerful and flexible, that you can almost think of it as a self-contained “operating system” in its own right.

Emacs is a very important part of the GNU development tools because it provides an integrated environment for software development. The simplest thing you can do with Emacs is edit your source code. However, you can do a lot more than that. You can run a debugger, and step through your program while Emacs shows you the corresponding sources that you are stepping through. You can browse on-line Info documentation and man pages, download and read your email off-line, and follow discussions on newsgroups. Emacs is particularly helpful with writing documentation with the Texinfo documentation system. You will find it harder to use Texinfo, if you don't use Emacs. It is also very helpful with editing files on remote machines over FTP, especially when your connection to the internet is over a slow modem. Finally, and most importantly, Emacs is programmable. You can write Emacs functions in Emacs Lisp to automate any chore that you find particularly useful in your own work. Because Emacs Lisp is a full programming language, there is no practical limit to what you can do with it.

If you already know a lot about Emacs, you can skip this chapter and move on. If you are a “vi” user, then we will assimilate you: See Using vi emulation, for details. 2 This chapter will be most useful to the novice user who would like to get Emacs up and running for software development; however, it is not by any means comprehensive. See Further reading on Emacs, for references to more comprehensive Emacs documentation.

3.1 Introduction to Emacs

Emacs is an environment for running Lisp programs that manipulate text interactively. Because Emacs is completely programmable, it can be used to implement not only editors, but a full integrated development environment for software development. Emacs can also browse info documentation, run email clients, a newsgroup reader, a sophisticated xterm, and an understanding psychotherapist.

Under the X window system, Emacs controls multiple x-windows called frames. Each frame has a menu bar and the main editing area. The editing area is divided into windows with horizontal bars. You can grab these bars and move them around with the first mouse button. 3 Each window is bound to a buffer. A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected in both windows. It is not necessary for a buffer to be bound to a window in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you have on your screen.

A buffer can visit a file. In that case, the contents of the buffer reflect the contents of a file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press <Enter> while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window to that buffer. From the user's perspective, pressing <Enter> “opened” the file for editing. If the file has already been “opened” then Emacs simply rebinds the window to the existing buffer for that file.

Emacs uses a variant of LISP, called Emacs LISP, as its programming language. Every time you press a key, click the mouse, or select an entry from the menu bar, an Emacs LISP function is evaluated. The mode of the buffer determines, among many other things, what function to evaluate. This way, every buffer can be associated with functionality that defines what you do in that buffer. For example you can program your buffer to edit text, to edit source code, to read news, and so on. You can also run LISP functions directly on the current buffer by typing M-x and the name of the function that you want to run. 4

What is known as the “Emacs editor” is the default implementation of an editor under the Emacs system. If you prefer the vi editor, then you can instead run a vi clone, Viper (see Using vi emulation). The main reason why you should use Emacs, is not the particular editor, but the way Emacs integrates editing with all the other functions that you like to do as a software developer. For example:

You can edit multiple files under one program. From the user perspective, you can edit two different parts of a file under two different x-windows. And when you revisit a file, the cursor is placed where it was the last time you were editing the file.

You can quickly browse a directory and navigate from file to file. You can also do simple operations on files, without needing to go to a shell.

You can transparently edit files over FTP. This is extremely valuable if you are editing source code on a remote computer and you are connected through a modem link.

You can have a running shell for typing unix commands, and access the same shell from any Emacs frame. You can use that shell to run ‘reconf’, ‘configure’ and ‘make’. You can also save the contents of your session to a file.

Color is used to highlight syntactic information about the text. This makes browsing more pleasing to the eye, and it can also help you catch syntactic mistakes. Emacs understands the syntax of most types of files you are likely to edit and will color them up for you accordingly.

When you edit source code under Emacs, it will automatically be formatted for you to conform to the GNU coding standards. At your request, appropriate copyright notices can be inserted. (see Inserting notices with Emacs)

When you make changes to a file, Emacs can automatically warp you to the appropriate ChangeLog file to record your changes. It will handle formatting details for you allowing you to focus on content. (see Maintaining the documentation files)

Emacs is invaluably helpful for writing Texinfo documentation. In fact, it is excruciatingly painful to maintain Texinfo documentation without using Emacs. (see GNU Emacs support for Texinfo)

You can run the gdb debugger under Emacs and use it to step through your code. As you do that, Emacs will show you on a separate buffer the code that is currently being stepped through.

You can read email and newsgroups. If you are connected over a modem, all your editing is done locally, so you do not get bogged down by the speed of your connection. You can apply patches that you get through email or news to your source code directly, without needing to save the message to a file.

Emacs currently supports almost every international language, even languages that do not use the Roman alphabet, like Greek, Chinese, Hebrew, Tibetan, etc.

All of these features make Emacs a very powerful, albeit unusual, integrated development environment. Many users of proprietary operating systems, like Lose95 5, complain that GNU (and Unix) does not have an integrated development environment. As a matter of fact it does. All of the above features make Emacs a very powerful IDE.

Emacs has its own very extensive documentation (see Further reading on Emacs). In this manual we will only go over the fundamentals for using Emacs effectively as an integrated development environment.

3.2 Installing GNU Emacs

If Emacs is not installed on your system, you will need to get a source code distribution and compile it yourself. Installing Emacs is not difficult. If Emacs is already installed on your GNU/Linux system, you might still need to reinstall it: you might not have the most recent version, you might have XEmacs instead, you might not have support for internationalization, or your Emacs might not have compiled support for reading mail over POP (a feature very useful to developers that hook up over modem). If any of these is the case, then uninstall that version of Emacs, and reinstall Emacs from a source code distribution.

The entire Emacs source code is distributed in three separate files:

emacs-21.2.tar.gz This is the main Emacs distribution. If you do not care about international language support, you can install this by itself.

leim-21.2.tar.gz This supplements the Emacs distribution with support for multiple languages. If you develop internationalized software, it is likely that you will need this.

intlfonts-1.1.tar.gz This file contains the fonts that Emacs uses to support international languages. If you want international language support, you will definitely need this.

Unpack the first two files with the following commands:

% gunzip emacs-21.2.tar.gz
% tar xf emacs-21.2.tar
% gunzip leim-21.2.tar.gz
% tar xf leim-21.2.tar

Both tarballs will unpack under the emacs-21.2 directory. When this is finished, configure and build the source code with the following commands:

% cd emacs-21.2
% ./configure --with-pop --with-gssapi
% make

The ‘ --with-pop ’ flag is almost always a good idea, especially if you are running Emacs from a home computer that is connected to the internet over modem. It will let you use Emacs to download your email from your internet provider and read it off-line (see Using Emacs as an email client). Most internet providers use GSSAPI-authenticated POP. If you need to support other authentication protocols however, you may also want to add one of the following flags:

--with-kerberos support Kerberos-authenticated POP

--with-kerberos5 support Kerberos version 5 authenticated POP

--with-hesiod support Hesiod to get the POP server host

After configuring, compile and install Emacs. The ‘#’ prompt marks a command that is run as root:

$ make
# make install

Emacs is a very large program, so this will take a while.

To install intlfonts-1.1.tar.gz unpack it, and follow the instructions in the README file. Alternatively, you may find it more straightforward to install it from a Debian package. Packages for intlfonts exist as of Debian 2.1.

3.3 Basic Emacs concepts

In this section we describe what Emacs is and what it does. We will not yet discuss how to make Emacs work. That discussion is taken up in the subsequent sections, starting with Configuring GNU Emacs. This section instead covers the fundamental ideas that you need to understand in order to make sense out of Emacs.

You can run Emacs from a text terminal, such as a vt100 terminal, but it is usually nicer to run Emacs under the X window system. To start Emacs type

% emacs &

at your shell prompt. Seasoned GNU developers usually set up their X configuration so that it starts Emacs when they log in, and then use that Emacs process for all of their work until they log out. To quit Emacs press C-x C-c , or select

Files ==> Exit Emacs

from the menu. The notation C-c means <CTRL>-c . The separating dash ‘ - ’ means that you press the key after the dash while holding down the key before the dash. Be sure to quit Emacs before logging out, to ensure that your work is properly saved. If there are any files that you haven't yet saved, Emacs will ask whether you want to save them before quitting. If at any time you want Emacs to stop doing what it's doing, press C-g .

Under the X window system, Emacs controls multiple X windows which are called frames . Each frame has a menu bar and the main editing area. The editing area is divided into windows by horizontal bars, called status bars . Every status bar contains concise information about the status of the window above the status bar. The minimal editing area has at least one big window, where editing takes place, and a small one-line window called the minibuffer . Emacs uses the minibuffer to display brief messages and to prompt the user to enter commands or other input. The minibuffer has no status bar of its own.

Each window is bound to a buffer . A buffer is an Emacs data structure that contains text. Most editing commands operate on buffers, modifying their contents. When a buffer is bound to a window, then you can see its contents as they are being changed. It is possible for a buffer to be bound to two windows, on different frames or on the same frame. Then whenever a change is made to the buffer, it is reflected on both windows. It is not necessary for a buffer to be bound to a window, in order to operate on it. In a typical Emacs session you may be manipulating more buffers than the windows that you actually have on your screen.
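The point that a buffer can be changed without being bound to any window can be tried directly. The following is only a sketch: the buffer name "*notes*" and the inserted text are arbitrary choices for illustration, not anything Emacs defines.

```elisp
;; Hypothetical example: "*notes*" is an arbitrary buffer name.
;; Create the buffer if needed and edit it, even though no window shows it.
(with-current-buffer (get-buffer-create "*notes*")
  (goto-char (point-max))
  (insert "Remember to update the ChangeLog.\n"))

;; Later, bind the buffer to the selected window to see its contents.
(switch-to-buffer "*notes*")
```

You can evaluate these forms in the *scratch* buffer by placing the cursor after each closing parenthesis and pressing C-x C-e .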

A buffer can visit a file. In that case, the contents of the buffer reflect the contents of the file that is being edited. But buffers can be associated with anything you like, so long as you program it up. For example, under the Dired directory editor, a buffer is bound to a directory, showing you the contents of the directory. When you press <RET> while the cursor is over a file name, Emacs creates a new buffer, visits the file, and rebinds the window to that buffer. From the user's perspective, pressing <RET> “opened” the file for editing. If the file has already been “opened” then Emacs simply rebinds the window to the existing buffer for that file.

Sometimes Emacs will divide a frame into two or more windows. You can switch from one window to another by clicking the 1st mouse button, while the mouse is inside the destination window. To resize these windows, grab the status bar with the 1st mouse button and move it up or down. Pressing the 2nd mouse button, while the mouse is on a status bar, will bury the window below the status bar. Pressing the 3rd mouse button will bury the window above the status bar, instead. Buried windows are not killed; they still exist and you can get back to them by selecting them from the menu bar, under:

Buffers ==> name-of-buffer

Buffers, with some exceptions, are usually named after the filenames of the files that they correspond to.

Once you visit a file for editing, then all you need to do is to edit it! The best way to learn how to edit files using the standard Emacs editor is by working through the on-line Emacs tutorial. To start the on-line tutorial type C-h t or select:

Help ==> Emacs Tutorial

If you are a vi user, or you simply prefer to use `vi' key bindings, then read Using vi emulation.

In Emacs, every event causes a Lisp function to be executed. An event can be any keystroke, mouse movement, mouse clicking or dragging, or a menu bar selection. The function implements the appropriate response to the event. Almost all of these functions are written in a variant of Lisp called Emacs Lisp. The actual Emacs program, the executable, is an Emacs Lisp interpreter with the implementation of frames, buffers, and so on. However, the actual functionality that makes Emacs usable is implemented in Emacs Lisp.

Sometimes, Emacs will bind a few words of text to an Emacs function. For example, when you use Emacs to browse Info documentation, certain words that correspond to hyperlinks to other nodes are bound to a function that makes Emacs follow the hyperlink. When such a binding is installed, moving the mouse over the bound text highlights it momentarily. While the text is highlighted, you can invoke the binding by clicking the 2nd mouse button.

Sometimes, an Emacs function might go into an infinite loop, or it might start doing something that you want to stop. You can always make Emacs abort the function it is currently running by pressing C-g .

Emacs functions are usually spawned by Emacs itself in response to an event. However, the user can also spawn an Emacs function by typing:

<ALT>-x function-name <RET>

These functions can also be aborted with C-g .

It is standard in Emacs documentation to refer to the <ALT> key with the letter ‘ M ’. So, in the future, we will be referring to function invocations as:

M-x function-name
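As an illustration of a function that can be invoked this way, here is a minimal sketch of a user-defined command. The name ‘my-insert-date’ is hypothetical; the ‘(interactive)’ form is what makes a function callable through M-x.

```elisp
;; Hypothetical command; the name my-insert-date is our own choice.
(defun my-insert-date ()
  "Insert today's date at the cursor position."
  (interactive)                       ; makes the function callable with M-x
  (insert (format-time-string "%Y-%m-%d")))
```

Once this definition has been evaluated, for example from your .emacs file, typing M-x my-insert-date inserts the date into the current buffer.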

Because Emacs functionality is implemented in an event-driven fashion, the Emacs developer has to write Lisp functions that implement functionality, and then bind these functions to events. Tables of such bindings are called keymaps .

Emacs has a global keymap , which is in effect at all times, and then it has specialized keymaps depending on what editing mode you use. Editing modes are selected when you visit a file depending on the name of the file. So, for example, if you visit a C file, Emacs goes into the C mode. If you visit Makefile , Emacs goes into makefile mode. The reason for associating different modes with different types of files is that the user's editing needs depend on the type of file being edited.

You can also enter a mode by running the Emacs function that initializes the mode. Here are the most commonly used modes:

M-x c-mode Mode for editing C programs according to the GNU coding standards.

M-x c++-mode Mode for editing C++ programs

M-x sh-mode Mode for editing shell scripts.

M-x m4-mode Mode for editing Autoconf macros.

M-x texinfo-mode Mode for editing documentation written in the Texinfo formatting language. See Introduction to Texinfo.

M-x makefile-mode Mode for editing makefiles.

3.4 Configuring GNU Emacs

To use Emacs effectively for software development you need to configure it. Part of the configuration needs to be done in your X-resources file. On a Debian GNU/Linux system, the X-resources can be configured by editing

/etc/X11/Xresources

In many systems, you can configure X-resources by editing a file called .Xresources or .Xdefaults on your home directory, but that is system-dependent. The configuration that I use on my system is:

! Emacs defaults
emacs*Background: Black
emacs*Foreground: White
emacs*pointerColor: White
emacs*cursorColor: White
emacs*bitmapIcon: on
emacs*font: fixed
emacs*geometry: 80x40

In general I favor dark backgrounds and the ‘fixed’ font. Dark backgrounds make it easier to sit in front of the monitor for a prolonged period of time. The ‘fixed’ font looks nice and is small enough to make efficient use of your screen space. Some people might prefer larger fonts however.

When Emacs starts up, it looks for a file called .emacs at the user's home directory, and evaluates its contents through the Emacs Lisp interpreter. You can customize and modify Emacs' behaviour by adding commands, written in Emacs Lisp, to this file. Here's a brief outline of the ways in which you can customize Emacs:

A common change to the standard configuration is assigning non-default values to global variables. Many Emacs features and behaviours can be controlled and customized this way. This is done with the ‘setq’ command, which accepts the following syntax:

(setq variable value)

For example:

(setq viper-mode t)

You can access on-line documentation for global variables by running:

M-x describe-variable

In some cases, Emacs depends on the values of shell environment variables. These can be manipulated with ‘setenv’:

(setenv "variable" "value")

For example:

(setenv "INFOPATH" "/usr/info:/usr/local/info")

‘setenv’ does not affect the shell that invoked Emacs, but it does affect Emacs itself, and shells that are run under Emacs.

Another way to enhance your Emacs configuration is by modifying the global keymap. This can be done with the ‘global-set-key’ command, which has the following syntax:

(global-set-key [key sequence] 'function)

In a key sequence, function keys are written as lowercase symbols and ordinary characters with a leading ‘?’. For example, adding:

(global-set-key [f12 ?d] 'doctor)

to .emacs makes the key sequence F12 d equivalent to running ‘M-x doctor’. Emacs has many functions that provide all sorts of features. To find out about specific functions, consult the Emacs user manual. Once you know that a function exists, you can also get on-line documentation for it by running:

M-x describe-function

You can also write your own functions in Emacs Lisp.

It is not always good to introduce bindings to the global map. Any bindings that are useful only within a certain mode should be added only to the local keymap of that mode. Consider for example the following Emacs Lisp function:

(defun texi-insert-@example ()
  "Insert an @example @end example block"
  (interactive)
  (beginning-of-line)
  (insert "\n@example\n")
  (save-excursion
    (insert "\n")
    (insert "@end example\n")
    (insert "\n@noindent\n")))

We would like to bind this function to the key ‘F9’, however we would like this binding to be in effect only when we are within ‘texinfo-mode’. To do that, first we must define a hook function that establishes the local bindings using ‘define-key’:

(defun texinfo-elef-hook ()
  (define-key texinfo-mode-map [f9] 'texi-insert-@example))

The syntax of ‘define-key’ is similar to ‘global-set-key’ except it takes the name of the local keymap as an additional argument. The local keymap of any ‘name-mode’ is ‘name-mode-map’. Finally, we must ask ‘texinfo-mode’ to call the function ‘texinfo-elef-hook’. To do that use the ‘add-hook’ command:

(add-hook 'texinfo-mode-hook 'texinfo-elef-hook)

In some cases, Emacs itself will provide you with a few optional hooks that you can attach to your modes.

You can write your own modes! If you write a program whose use involves editing some type of input files, it is very much appreciated by the community if you also write an Emacs mode for that file and distribute it with your program.

With the exception of simple customizations, most of the more complicated ones require that you write new Emacs Lisp functions, distribute them with your software and somehow make them visible to the installer's Emacs when your software is installed. See Emacs Lisp with Automake, for more details on how to include Emacs Lisp packages in your software.
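To give a rough idea of what writing your own mode involves, here is a minimal sketch that derives a new major mode from ‘text-mode’ using ‘define-derived-mode’. The mode name ‘foo-mode’, the ‘.foo’ file extension and the ‘#’ comment character are all hypothetical, chosen only for illustration; a real mode would add its own key bindings, indentation and font-lock rules.

```elisp
;; Hypothetical mode for imaginary ".foo" input files.
(define-derived-mode foo-mode text-mode "Foo"
  "Major mode for editing hypothetical .foo input files."
  ;; Assumption: the imaginary format starts comments with `#'.
  (set (make-local-variable 'comment-start) "# "))

;; Enter foo-mode automatically when visiting a file ending in .foo.
(add-to-list 'auto-mode-alist '("\\.foo\\'" . foo-mode))
```

‘define-derived-mode’ also creates the keymap ‘foo-mode-map’ and the hook ‘foo-mode-hook’, so users of the mode can customize it with ‘define-key’ and ‘add-hook’ in the same way as the built-in modes.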

Here are some simple customizations that you might want to add to your .emacs file:

Set your default background and foreground color for all your Emacs frames:

(set-background-color "black")
(set-foreground-color "white")

You can change the colors to your liking.

Tell Emacs your name and your email address. This is particularly useful when you work on an off-line home system but you want Emacs to use the email address of your internet provider, and your real name. Specifying your real name is necessary if you call yourself “Skeletor” or “Dude” on your home computer.

(setq user-mail-address "karl@whitehouse.com")
(setq user-full-name "President Karl Marx")

Make sure the name is your real name, and the email address that you include can receive email 24 hours per day.

Add a few toys to the status bar. These commands tell Emacs to display a clock, and the line and column number of your cursor's position at all times.

(display-time)
(line-number-mode 1)
(column-number-mode 1)

When you use the mouse to cut and paste text with Emacs, mouse button 1 will select text and mouse button 2 will paste it. Unfortunately, when you click mouse button 2, Emacs will first move the cursor to the location of the mouse, and then insert the text in that location. If you are used to editing with vi under xterm, you will instead prefer to position the cursor yourself, and use mouse button 2 to simply cause the text to be pasted without changing the position of the cursor. If you prefer this behaviour, then add the following line to your .emacs :

(global-set-key [mouse-2] 'yank)

By default, selected text in Emacs buffers is highlighted with blue color. However, you can also select and paste into an Emacs buffer text that you select from other applications, like your web browser, or your xterm.

Use font-lock . Font-lock decorates your edited text with colors that make it easier to read text with complicated syntax, such as software source code. This is one of the coolest features of Emacs. To use it, add the following lines to your .emacs :

(global-font-lock-mode t)
(setq font-lock-maximum-size nil)

To get rid of the scrollbar at the left of your Emacs window, add

(setq scroll-bar-mode nil)

The only reason that the scrollbar is on by default is to make Emacs more similar to what lusers are used to. It is assumed that seasoned hackers, who will be glad to see the scrollbar bite it, will figure out how to make it go away.

With most versions of Emacs, you should add the following to your .emacs to make sure that editing configure.in takes you to m4-mode and editing Makefile.am takes you to makefile-mode .

(setq auto-mode-alist
      (append '(("configure.in" . m4-mode)
                ("\\.m4\\'" . m4-mode)
                ("\\.am\\'" . makefile-mode))
              auto-mode-alist))

You will have to edit such files if you use the GNU build system. See The GNU build system, for more details.

If you have installed Emacs packages in non-standard directories, you need to add them to the ‘load-path’ variable. For example, here's how to add a couple of directories:

(setq load-path
      (append (list "/usr/share/emacs/site-lisp"
                    "/usr/local/share/emacs/site-lisp"
                    (expand-file-name "~lf/lisp"))
              load-path))

The new directories are wrapped in ‘list’ because ‘append’ joins lists, not strings. Note the use of ‘expand-file-name’ for dealing with non-absolute directories. If you are a user in an account where you don't have root privilege, you are very likely to need to install your Emacs packages in a non-standard directory.

See Using vi emulation, if you would like to customize Emacs to run a vi editor under the Emacs system.

See Navigating source code, for more details on how to customize Emacs to make navigating a source code directory tree easier.

See Using Emacs as an email client, if you would like to set up Emacs to process your email.

Autotoolset distributes two Emacs packages: one for handling copyright notices, and another for handling Texinfo documentation. See Inserting copyright notices with Emacs, and see GNU Emacs support for Texinfo, for more details.

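Emacs also provides a graphical customization interface that can write settings to your .emacs for you. To use it, select: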

Help ==> Customize ==> Browse Customization Groups

from the menu bar. You can also manipulate some common settings from:

Help ==> Options

3.5 Using vi emulation

Many hackers prefer to use the ‘ vi ’ editor. The ‘ vi ’ editor is the standard editor on Unix. It is also always available on GNU/Linux. Many system administrators find it necessary to use vi, especially when they are in the middle of setting up a system in which Emacs has not been installed yet. Besides that, there are many compelling reasons why people like vi.

Vi requires only two special keys: the <SHIFT> key and the <ESC> key. All the other keys that you need are standard on all keyboards. You do not need <CTRL>, <ALT>, the cursor keys or any of the function keys. Terminals that lack the escape key usually have the control key, and you can get escape with: <CTRL>-[

Vi was designed to deal with terminals that connect to mainframes over a very slow line. So it has been optimized to allow you to do the most editing possible with the fewest keystrokes. This allows users to edit text very efficiently.

Vi allows your fingers to stay at the center of the keyboard, with the occasional hop to the escape key. It does not require you to stretch your fingers in funny control combinations, which makes typing less tiring and more comfortable.

The vi emulation package for the Emacs system is called Viper . To use Viper, add the following lines in your .emacs :

(setq viper-mode t)
(setq viper-inhibit-startup-message 't)
(setq viper-expert-level '3)
(require 'viper)

We recommend expert level 3, as the most balanced blend of the vi editor with the Emacs system. Most editing modes are aware of Viper, and when you begin editing the text you are immediately thrown into Viper. Some modes however do not do that. In some modes, like the Dired mode, this is very appropriate. In other modes however, especially custom modes that you have added to your system, Viper does not know about them, so it does not configure them to enter Viper mode by default. To tell a mode to enter Viper by default, add a line like the following to your .emacs file:

(add-hook 'm4-mode-hook 'viper-mode)

The modes that you are most likely to use during software development are

c-mode , c++-mode , texinfo-mode , sh-mode , m4-mode , makefile-mode
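If you want to be sure that all of these modes start in Viper, the ‘add-hook’ line shown earlier can be repeated for each of them. The following sketch does this in a loop; it assumes each mode follows the usual convention of running a hook named ‘name-mode-hook’, and it is harmless for modes that Viper already knows about.

```elisp
;; Ask each development mode to enter Viper when it starts.
;; Assumption: every mode below runs a hook named <mode>-hook.
(dolist (hook '(c-mode-hook c++-mode-hook texinfo-mode-hook
                sh-mode-hook m4-mode-hook makefile-mode-hook))
  (add-hook hook 'viper-mode))
```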

Sometimes, Emacs will enter Viper mode by default in modes where you prefer to get Emacs modes. In some versions of Emacs, the compilation-mode is such a mode. To tell a mode not to enter Viper by default, add a line like the following to your .emacs file:

(add-hook 'compilation-mode-hook 'viper-change-state-to-emacs)

The Emacs distribution has a Viper manual. For more details on setting Viper up, you should read that manual.

The vi editor has these things called editing modes. An editing mode defines how the editor responds to your keystrokes. Vi has three editing modes: insert mode , replace mode and command mode . If you run Viper, there is also the Emacs mode. Emacs indicates which mode you are in by showing one of ‘ <I> ’, ‘ <R> ’, ‘ <V> ’, ‘ <E> ’ on the status bar for the Insert, Replace, Command and Emacs modes, respectively. Emacs also shows you the mode by the color of the cursor. This makes it easy for you to keep track of which mode you are in.

Insert mode : When you are in insert mode, the editor simply inserts the things that you type into the text that is being edited. If there are any characters in front of your cursor, these characters are pushed ahead and they are not overwritten. Under Viper, when you are in insert mode, the color of your cursor is green. The only key that has special meaning, while you are in insert mode is <ESC>. If you press the escape key, you are taken to command mode .

Replace mode : When you are in replace mode, the editor replaces the text under the cursor with the text that is being typed. So, you want replace mode when you want to write over what's already written. Under Viper, when you are in replace mode, the color of your cursor is red. The <ESC> key will take you to command mode.

Command mode : When you are in command mode, every letter key that you press is a command and has a special meaning. Some of these keys allow you to navigate the text. Other keys allow you to enter either insert or replace mode. And other keys do various special things. Under Viper, when you are in command mode, the color of your cursor is white.

Emacs mode : When you are in Emacs mode, then Viper is turned off on the specific buffer, and Emacs behaves as the default Emacs editor. You can switch between Emacs mode and Command mode by pressing <CTRL>-z . So to go to Emacs mode from Insert or Replace mode, you need to go through Command mode. When you are dealing with a buffer that runs a special editing mode, like Dired, Emacs defines a specialized “command mode” for manipulating that buffer, that can be completely different from the canonical Viper command mode. You want to be in that mode to access the intended functionality. Occasionally however, you may like to hop to Viper's command mode to navigate the buffer, do a search or save the buffer's contents. When you hop to one of the other three modes, the buffer will suddenly be just text to your editor.

The following keystrokes allow you to navigate the cursor around your text without making any changes to the text itself:

h moves one character to the left

j moves down one line

k moves up one line

l moves one character to the right

w moves forward one word

5w moves forward five words (get the idea?)

b moves back one word

0 moves to the beginning of the current line

$ moves to the end of the current line

G moves to the last line in the file

1G moves to the first line in the file

:10 moves to line 10 in the file (get the idea?)

{ moves up one paragraph

} moves down one paragraph

The following keystrokes allow you to delete text:

x Deletes the character under the cursor

dd Deletes the current line

4dd Deletes four lines

dw Deletes the current word

8dw Deletes the next eight words

The following keystrokes allow you to enter Insert mode:

a Append text after the cursor position

i Insert text at the current cursor position

o Insert text on a new line below the current line

O Insert text on a new line above the current line

The following keystrokes allow you to enter Replace mode:

R Replace text at the cursor position and stay in Replace mode.

s Replace (substitute) only the character at the cursor position, and enter Insert mode for all subsequent characters.

The following commands handle file input/output. All of these commands are prefixed by the : character, which is used for commands that require many characters to be properly expressed. The full text of these commands is entered in the minibuffer. Under Viper, the minibuffer itself can run under insert, replace and command mode. By default you get insert mode, but you can switch to command mode by pressing <ESC>.

:w Save the file to the disk

:w! Force the file to be saved to disk even when file permissions do not allow it but you have the power to overrule the permissions.

:w filename <RET> Save the file to the disk under a specific filename. When you press <SPACE> Emacs inserts the full pathname of the current directory for you, which you can edit if you like.

:w! filename <RET> Force the file to be saved to the disk under a specific filename.

:r filename <RET> Paste a file from the disk at the cursor's current position.

:W Save all the files on all the Emacs buffers that correspond to open files.

:q Kill the buffer. This does not quit the editor at expert level 3.

:q! Kill the buffer even if the contents are not saved. Use with caution!

The following commands handle search and replace:

/ string <RET> Search for string .

n Go to the next occurrence of string .

N Go to the previous occurrence of string .

:%s/ string1 / string2 /g <RET> Replace all occurrences of string1 with string2 . Use this with extreme caution!

The following commands handle undo:

u Undo the previous change. Press again to undo the undo.

. Press this if you want to repeat the undo further.

3.6 Navigating source code

When you develop software, you need to edit many files at the same time, and you need an efficient way to switch from one file to another. The most general solution in Emacs is by going through Dired , the Emacs Directory Editor.

To use Dired effectively, we recommend that you add the following customizations to your .emacs file: First, add

(add-hook 'dired-load-hook
          (function (lambda () (load "dired-x"))))
(setq dired-omit-files-p t)

to activate the extended features of Dired . Then add the following key-bindings to the global keymap:

(global-set-key [f1] 'dired)
(global-set-key [f2] 'dired-omit-toggle)
(global-set-key [f3] 'shell)
(global-set-key [f4] 'find-file)
(global-set-key [f5] 'compile)
(global-set-key [f6] 'visit-tags-table)
(global-set-key [f8] 'add-change-log-entry-other-window)
(global-set-key [f12] 'make-frame)

If you use viper (see Using vi emulation), you should also add the following customization to your .emacs :

(add-hook 'compilation-mode-hook 'viper-change-state-to-emacs)

With these bindings, you can navigate from file to file or switch between editing and the shell simply by pressing the right function keys. Here's what these key bindings do:

f1 Enter the directory editor.

f2 Toggle the omission of boring files.

f3 Get a shell at the current Emacs window.

f4 Jump to a file, by filename.

f5 Run a compilation job.

f6 Load a TAGS file.

f8 Update the ChangeLog file.

f12 Make a new frame.


To go down a directory, move the cursor over the directory filename and press <RET>. To go up a few directories, press f1 and when you are prompted for the new directory, with the current directory as the default choice, erase your way up the hierarchy and press <RET>. To jump to a substantially different directory that you have visited recently, press f1 and then when prompted for the destination directory name, use the cursor keys to select the directory that you want among the list of directories that you have recently visited.

While in the directory navigator, you can use the cursor keys to move to another file. Pressing <RET> will bring that file up for editing. However there are many other things that Dired will let you do instead:

Z Compress the file. If already compressed, uncompress it.

L Parse the file through the Emacs Lisp interpreter. Use this only on files that contain Emacs Lisp code.

I, N Visit the current file as an Info file, or as a man page . See Browsing documentation.

d Mark the file for deletion

u Remove a mark on the file for deletion

x Delete all the files marked for deletion

C destination <RET> Copy the file to destination .

R filename <RET> Rename the file to filename .

+ directoryname <RET> Create a directory with name directoryname .

For more on Dired, see the GNU Emacs User Manual.

Emacs provides another method for jumping from file to file: tags . Suppose that you are editing a C program whose source code is distributed in many files, and while editing the source for the function foo , you note that it is calling another function gleep . If you move your cursor on gleep , then Emacs will let you jump to the file where gleep is defined by pressing M-. . You can also jump to other occurrences in your code where gleep is invoked by pressing M-, . In order for this to work, you need to do two things: you need to generate a tags file, and you need to tell Emacs to load the file. If your source code is maintained with the GNU build system, you can create the tags file by typing:

% make tags

from the top-level directory of your source tree. Then load the tags file in Emacs by navigating Dired to the top-level directory of your source code, and pressing f6 .
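If you work on the same source tree most of the time, you might prefer to have the tags file loaded when Emacs starts. The following is a sketch of how that could look in your .emacs ; the path "~/src/myproject/TAGS" is hypothetical and should be replaced with the location of your own tags file.

```elisp
;; Hypothetical path: point this at the TAGS file of your own source tree.
(let ((tags "~/src/myproject/TAGS"))
  ;; Only load the table if the file exists, so startup does not fail
  ;; when `make tags' has not been run yet.
  (when (file-exists-p tags)
    (visit-tags-table tags)))
```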

While editing a file, you may want to hop to the shell prompt to run a program. You can do that at any time, on any frame, by pressing f3 . To get out of the shell, and back into the file that you were editing, enter the directory editor by pressing f1 , and then press <RET> repeatedly. The default selections will take you back to the file that you were most recently editing on that frame.

One very nice feature of Emacs is that it understands tar files. If you have a tar file foo.tar and you select it under Dired, then Emacs will load the entire file, parse it, and let you edit the individual files that it includes directly. This only works, however, when the tar file is not compressed. Usually tar files are distributed compressed, so you should uncompress them first with Z before entering them. Also, be careful not to load an extremely huge tar file. Emacs may mean “eating memory and constantly swapping” to some people, but don't push it!

Another very powerful feature of Emacs is the Ange-FTP package: it allows you to edit files on other computers, remotely, over an FTP connection. From a user perspective, remote files behave just like local files. All you have to do is press f1 or f4 and request a directory or file with filename following this form:

/ username @ host :/ pathname

Then Emacs will access for you the file / pathname on the remote machine host by logging in over FTP as username . You will be prompted for a password, but that will happen only once per host. Emacs will then download the file that you want to edit and let you make your changes locally. When you save your changes, Emacs will use an FTP connection again to upload the new version back to the remote machine, replacing the older version of the file there. When you develop software on a remote computer, this feature can be very useful, especially if your connection to the Net is over a slow modem line. This way you can edit remote files just like you do with local files. You will still have to telnet to the remote computer to get a shell prompt. In Emacs, you can do this with M-x telnet . An advantage to telneting under Emacs is that it records your session, and you can save it to a file to browse it later.

While you are making changes to your files, you should also keep a diary of these changes in a ChangeLog file (see Maintaining the documentation files). Whenever you are done with a modification that you would like to log, press f8 while the cursor is still in the same file, and preferably near the modification (for example, if you are editing a C program, be inside the same C function). Emacs will split the frame into two windows. The new window brings up your ChangeLog file. Record your changes and click on the status bar that separates the two windows with the 2nd mouse button to get rid of the ChangeLog file. Because updating the log is a frequent chore, this Emacs help is invaluable.
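If f8 does not do this in your Emacs, the stock command behind this behaviour is add-change-log-entry-other-window, which Emacs normally binds to C-x 4 a. A minimal sketch of a .emacs fragment that sets it up might look as follows; the name and address are placeholders, and binding the command to f8 is our assumption about how the key is configured.

```elisp
;; Placeholder identity; Emacs uses these values in the header of
;; each new ChangeLog entry.  Replace them with your own.
(setq user-full-name "Jane Hacker"
      user-mail-address "jane@example.org")

;; Assumed binding: f8 opens the ChangeLog in another window and starts
;; an entry for the file (and function) where the cursor is.
(global-set-key [f8] 'add-change-log-entry-other-window)
```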

If you would like to compile your program, you can use the shell prompt to run ‘make’. However, the Emacs way is to use the M-x compile command. Press f5. Emacs will prompt you for the command that you would like to run. You can enter something like ‘configure’, ‘make’, ‘make dvi’, and so on (see Installing a GNU package). The command runs in the current directory of the current buffer. If your current buffer is visiting a file, then your command will run in the same directory as the file. If your current buffer is the directory editor, then your command will run in that directory. When you press <RET>, Emacs will split the frame to open another window, and it will show you the command's output in that window. If there are error messages, then Emacs converts these messages to hyperlinks and you can follow them by pressing <RET> while the cursor is on them, or by clicking on them with the 2nd mouse button. When you are done, click on the status bar with the 2nd mouse button to get the compilation window off your screen.
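For reference, here is a sketch of how such a setup could be written in your .emacs. The f5 binding is our assumption about how the key is configured; compile and compile-command are standard Emacs names.

```elisp
;; Assumed binding: f5 prompts for a command and runs it with M-x compile.
(global-set-key [f5] 'compile)

;; The command offered by default at the prompt.  With -k, make keeps
;; going after an error, so one run reports as many errors as possible.
(setq compile-command "make -k ")
```

You can still edit the command at the prompt before pressing <RET>, so the default only saves typing in the common case.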

3.7 Using Emacs as an email client

You can use Emacs to read your email. If you maintain free software, or in general maintain a very active internet life, you will get a lot of email. The Emacs mail readers have been designed to address the needs of software developers who get endless tons of email every day.

Emacs has two email programs: Rmail and Gnus. Rmail is simpler to learn, and it is similar to many other mail readers. The philosophy behind Rmail is that instead of separating messages into different folders, you attach labels to each message but leave the messages in the same folder. Then you can tell Rmail to browse only messages that have specific labels. Gnus, on the other hand, has a rather eccentric approach to email. It is a news-reader, so it makes your email look like another newsgroup! This is actually very nice if you are subscribed to many mailing lists and want to sort your email messages automatically. To learn more about Gnus, read the excellent Gnus manual. In this manual, we will only describe Rmail.

When you start Rmail, it moves any new mail from your mailboxes to the file ~/RMAIL in your home directory. So, the first thing you need to tell Rmail is where your mailboxes are. To do that, add the following to your .emacs :

(require 'rmail)
(setq rmail-primary-inbox-list (list "mailbox1" "mailbox2" ...))

If your mailboxes are on a filesystem that is mounted to your computer, then you just have to list the corresponding filenames. If your mailbox is on a remote computer, then you have to use the POP protocol to download it to your own computer. In order for this to work, the