Perl Programming for Biologists


ISSN: 0368-492X

Article publication date: 1 September 2004




Andrew, A.M. (2004), "Perl Programming for Biologists", Kybernetes, Vol. 33 No. 8, pp. 1335-1336.



Emerald Group Publishing Limited

Copyright © 2004, Emerald Group Publishing Limited

This is a comprehensive guide to the use of this useful and interesting programming language. It is written in a pleasant, chatty style with obvious enthusiasm for the topic. No previous knowledge of programming is assumed, and the treatment begins with the “Hello world!” program that is familiar from introductions to various other languages.

Perl is a general‐purpose programming language, but one with many unusual features. It has special facilities for text manipulation that readily allow the parsing and transformation of data files to new formats, and that can be applied to the manipulation of DNA sequences and the search for matching segments. The name originated as an acronym for “Practical Extraction and Report Language”.
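To give a flavour of the text facilities described above, here is a minimal sketch that is not taken from the book. The DNA fragment is invented for illustration, and the motif GAATTC (the EcoRI recognition site) is simply a convenient choice of pattern to search for.

```perl
use strict;
use warnings;

# Hypothetical DNA fragment, invented for this illustration.
my $dna = "ATGCGAATTCGTAGAATTCA";

# Reverse complement: reverse the string, then swap the bases with tr///.
my $revcomp = scalar reverse $dna;
$revcomp =~ tr/ACGT/TGCA/;

# Collect the 0-based start position of every occurrence of the motif.
my @positions;
while ($dna =~ /GAATTC/g) {
    push @positions, $-[0];    # $-[0] holds the start offset of the last match
}

print "Reverse complement: $revcomp\n";
print "Motif found at positions: @positions\n";
```

The whole task takes only a handful of lines because pattern matching (`=~ /.../g`) and character translation (`tr///`) are built into the language rather than supplied by a library.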

Other useful features of Perl include means of interfacing to a wide range of database formats, and to standard systems used in storing and searching databases of DNA sequences. The main such systems are listed and described in an Appendix to the book. The tools denoted by the acronyms BLAST (Basic Local Alignment Search Tool) and FASTA are explained; they are features of, for example, the human genome site. Perl has also become the principal language used in programming interactive Web sites.

The book has three parts, labelled, respectively, as The Basics, Intermediate Perl and Advanced Perl. The first part, and the initial chapters of the second, introduce the reader very gently to principles of general‐purpose programming in Perl, which at this stage has much in common with other languages, and is said to be very easy to learn. However, Chapter 7, in the “Intermediate” part, on Input and Output, treats very complex and unfamiliar aspects, since Perl not only accepts input and provides output through the usual channels but when running on a multi‐thread machine can even be made to interact with other programs running simultaneously.

In the “Advanced” part of the book the reader is introduced to the use of pointers (called references in Perl), as in Pascal, and also to Object‐Oriented Programming. Perl appears to have the nice feature that it can be used with or without OOP, and this seems to imply a smoother transition for learners than in, say, going from C to C++. One deficiency, shared also by the much more primitive JavaScript, is that there seems to be no facility for any kind of graphics. However, the fact that Perl can interface readily with a range of database formats (and even, as has been seen, with other programs) should mean that data can readily be passed to a separate graph‐drawing facility.
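The point that Perl can be used with or without OOP can be illustrated by a short sketch, again not taken from the book. The gene name and length are invented, and the package name `Gene` and its methods are hypothetical. The first part uses a plain hash through a reference; the second wraps the same data in a small class.

```perl
use strict;
use warnings;

# Procedural style: a plain hash reached through a reference.
# The values are invented for illustration.
my %record = (name => "geneX", length => 1200);
my $ref    = \%record;                     # reference to the hash
print "$ref->{name} is $ref->{length} bp\n";

# Object-oriented style: the same data wrapped in a hypothetical class.
package Gene;

sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name}, length => $args{length} };
    return bless $self, $class;            # bless ties the hash reference to the class
}

sub describe {
    my ($self) = @_;
    return "$self->{name} is $self->{length} bp";
}

package main;

my $gene = Gene->new(name => "geneX", length => 1200);
print $gene->describe(), "\n";
```

An object here is just a reference that has been associated with a package, which is one reason the step from procedural to object‐oriented code can be a small one.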

Another nice feature is that everything needed to use the language can be downloaded free from the Web. I found that the version for Windows 98 needed about an hour of download time with the usual 56 kb/s modem. A large amount of documentation is included in the download, and it would probably be possible to learn to use the language without the help of the present book. Nevertheless, the book is well worthwhile for the orientation it gives, for making learning much easier, and for the chance to be infected by the author's obvious enthusiasm.
