Parrot - Perl's new Virtual Machine

COMPILED V INTERPRETED V VIRTUAL MACHINE LANGUAGES

Whatever programming language they use, you want your programming staff to be able to develop and maintain applications quickly and efficiently, AND you want the applications to run fast too. With traditional languages, however, there was a trade off between these two goals.

COMPILED LANGUAGES

Computer languages such as C and C++ (and Cobol and Fortran and many many more) are what we call "compiled" languages. The programmer edits or enters his program into a series of source files, then runs it through a compiler that produces "object files" - snippets of executable program suitable for use on a specific type of processor and operating system. A further stage, the loader (or linker or taskbuilder) connects all the object files necessary to make a complete application, together with standard libraries, into an executable program file.

Compiled languages can result in programs that run very fast indeed on their target computer, but the need to re-compile and re-load, and have different program variants depending on operating system and processor, means that the development process can be longwinded; tools such as "make" help, but a traditional compiled language will never be as easy as "change and run".

INTERPRETATIVE LANGUAGES

For applications which are to be written rapidly, and don't need to run fast, you would traditionally use an interpretative language. Many versions of Basic were/are interpretative, as are Shell programming languages, batch files, and so on. With an interpretative language, a program called the interpreter runs every time that the application is to run, and it interprets and then acts on the application line by line.

Developing a program in an interpretative language takes away all the hassle of compilers and loaders - just point the interpreter at the text file containing the program source and it will run. But it may run slowly. Each line has to be interpreted each time it is to be performed, and so you have extra run time overheads that can make interpretative language code execution an order of magnitude slower than compiled code execution.

VIRTUAL MACHINE BASED LANGUAGES

Compiled languages are fast to run, and interpretative languages are simpler to develop. BUT compiled languages are more complex to develop, and interpretative languages run much slower. Is there another scheme that gives us the best of both worlds? Not quite - but we can achieve something that runs much faster than a pure interpretative system, and is easier to develop than a pure compiled system, using the concept of a virtual machine.

A virtual machine based language has both a compiler AND an interpreter. The programmer enters his code into a plain text file (exactly as he would for a compiled or an interpretative system), then presents it to the compiler. Rather than this compiler producing machine code for a particular operating system and processor, though, it produces machine code for a "virtual machine".

The virtual machine runs after the compiler; it takes the machine code output from the compiler, and interprets and runs that machine code. The virtual machine doesn't have to do all the program analysis of a regular interpreter (that's already been done by the compiler), so it can run very much faster than such an interpreter.
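As a rough illustration of that split between compiler and virtual machine, here is a minimal sketch in Python. The three-instruction "machine", the postfix source notation and the names compile_rpn and run are all invented for this example; they are not taken from Perl, Java or Parrot. The point is only that the text is analysed once, and the interpreter loop then works on simple numeric instructions.

```python
# Toy sketch: compile the source text once, then interpret the compiled form.
# The opcodes and function names below are invented for illustration only.

PUSH, ADD, MUL = 0, 1, 2


def compile_rpn(source):
    """Translate postfix text such as '2 3 4 * +' into (opcode, operand) pairs."""
    code = []
    for token in source.split():
        if token == "+":
            code.append((ADD, None))
        elif token == "*":
            code.append((MUL, None))
        else:
            code.append((PUSH, int(token)))  # text-to-number conversion happens here, once
    return code


def run(code):
    """Interpret the compiled instructions; no text parsing happens at this stage."""
    stack = []
    for opcode, operand in code:
        if opcode == PUSH:
            stack.append(operand)
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
    return stack.pop()


bytecode = compile_rpn("2 3 4 * +")   # "compile" step
print(run(bytecode))                  # "virtual machine" step, prints 14
```

The compiled list could equally be saved and run later, or run many times, without the source text being looked at again.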

One of the earliest virtual machine based systems was the P-System of UCSD (University of California at San Diego) and their Pascal compiler. The P-codes (as the intermediate compiler output was known) could be run through virtual machines on a huge range of computers with DEC, Zilog, Motorola, Intel and many other processor types. A similar concept was also adopted by Larry Wall when he created the Perl language, and more recently it has been adopted in Java and in Microsoft's .NET environment. PHP also uses an embedded "compile then execute" scheme from release 4 on, through the Zend engine.

PERL'S VIRTUAL MACHINE

From its early days, Perl used a virtual machine. In the Perl model (up to and including release 5 of Perl), the compiler and interpreter are built into the same program ("perl"), and when you run perl on a text file the file is compiled into "byte code" (Larry's name for the intermediate compiler output) which is then passed straight on into the interpreter.

Thus, to run a Perl program all you do is type
 perl programname
and the text in the file "programname" is compiled to byte code which is then interpreted through the Perl virtual machine. It's a single step as far as the user is concerned - and after all Perl was designed to be easy to write and easy to run.

In the more recent versions of Perl (it's now up to release 5.6.1), the B::Bytecode module allows users to run a compile to produce a file which can be used to feed Perl's virtual machine at a later time, and perhaps on a different computer. The B::Bytecode module is officially described as "experimental"; Perl has been around now for over 13 years, and it's not easy to add a facility such as this to such a mature language. When we come on to Perl 6, now under development, the model changes - see the box on "Parrot".

JAVA VIRTUAL MACHINES

The whole concept of Java's portability is based on its virtual machine. Source code is entered into .java files, and then compiled into .class files, suitable for use in a Java Virtual Machine or JVM.

All Java applications make extensive use of standard classes provided by the authors of Java, and very often they'll use classes written by other suppliers too. These classes, additional to the ones written specifically for your application(s), will usually be resident on the same system on which the JVM is to be run, and the JVM plus these classes is known as the JRE or Java Runtime Environment.

Although Java (in the form of .java and .class files) is a portable language, the JRE (and in particular the JVM part of it) is specific to the processor / operating system that it's running on, and also to the environment in which it's in use. Commonly used JVMs include ones to run applets (this JVM is built into the browser, or added as a plugin), and Servlets and JSPs (this being built into Tomcat, WebSphere or JRun), as well as the JVM supplied with Java itself, and JVMs that are used on enterprise servers.

PARROT - PERL 6'S VIRTUAL MACHINE

The forthcoming Perl 6 will incorporate a brand new virtual machine, written with benefit of the experience of the old bytecode machine as well as other virtual machines such as the JVM. This new virtual machine is known as "parrot", and it's designed to suit not only Perl 6, but also Perl 5, and probably other bytecode compiled languages such as Python and Tcl.

For the more technical reader, Parrot is a register based virtual machine that supports dynamic data typing. In other words, it handles Perl's ability to switch variables from strings to numbers to references rather than having to emulate such changes more slowly within software. It's planned that the Perl 6 COMPILER will later on be able to produce output for a JVM and for .NET, but the code will run more slowly on such virtual machines than it will on Parrot.

Parrot's input is a compact binary byte format - to be known as parrot byte code. Unlike Perl 5, where the byte code was purely internal to perl until B::Bytecode came along, this new byte code is designed for saving to file / distribution via a network, etc., and the file extension you'll most commonly see will be .pbc

HOW DO I WRITE CODE FOR A VIRTUAL MACHINE?

Usually, you write your source code in a language such as Java or Perl, and have the compiler for that language produce the appropriate byte code. Although byte code formats are published, the application developer and maintainer will rarely access them directly.

It is instructive, though, for more experienced programmers to have an understanding of byte codes and how they work. The early test releases of Parrot, which you can download from the CPAN, come with an assembler program that lets you code Parrot virtual machine instructions into a text file, and then convert them to a .pbc (Parrot Byte Code) file.

Here's a sample program written in Parrot assembler:

# This is the first test of Parrot
 print "Hello birdie World - triangle numbers\n"
 print "This is a test of Parrot\n"
 set I1,0
 set I2,0
REDO: inc I2
 add I1, I2, I1
 print I2
 print " gives "
 print I1
 print "\n"
 lt I2, 10, REDO
 end

We convert that into a Parrot byte code file using the assembler program (itself written in Perl):

$ perl assemble.pl hello.par > hello.pbc
$

And we can then run the program through the parrot virtual machine:

$ parrot hello.pbc
Hello birdie World - triangle numbers
This is a test of Parrot
1 gives 1
2 gives 3
3 gives 6
4 gives 10
5 gives 15
6 gives 21
7 gives 28
8 gives 36
9 gives 45
10 gives 55
$

There's nothing to stop you looking inside the .pbc file provided you use an appropriate tool; this example looks at the file first as text characters, and then as 4-byte integers:

$ od -c hello.pbc
0000000 U 1 001 \0 \0 \0 \0 \0 \0 \0 004 \0 \0 \0
0000020 s \0 \0 \0 8 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000040 \0 \0 \0 \0 & \0 \0 \0 H e l l o b i
0000060 r d i e W o r l d - t r i
0000100 a n g l e n u m b e r s \n \0 \0
0000120 s \0 \0 \0 , \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000140 \0 \0 \0 \0 031 \0 \0 \0 T h i s i s
0000160 a t e s t o f P a r r o t
0000200 \n \0 \0 \0 s \0 \0 \0 030 \0 \0 \0 \0 \0 \0 \0
0000220 \0 \0 \0 \0 \0 \0 \0 \0 \a \0 \0 \0 g i v
0000240 e s \0 s \0 \0 \0 024 \0 \0 \0 \0 \0 \0 \0
0000260 \0 \0 \0 \0 \0 \0 \0 \0 001 \0 \0 \0 \n \0 \0 \0
0000300 t \0 \0 \0 037 \0 \0 \0 \0 \0 \0 \0 037 \0 \0 \0
0000320 001 \0 \0 \0 8 \0 \0 \0 001 \0 \0 \0 \0 \0 \0 \0
0000340 8 \0 \0 \0 002 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000360 002 \0 \0 \0 232 \0 \0 \0 001 \0 \0 \0 002 \0 \0 \0
0000400 001 \0 \0 \0 032 \0 \0 \0 002 \0 \0 \0 037 \0 \0 \0
0000420 002 \0 \0 \0 032 \0 \0 \0 001 \0 \0 \0 037 \0 \0 \0
0000440 003 \0 \0 \0 x \0 \0 \0 002 \0 \0 \0 \n \0 \0 \0
0000460 \0 \0 \0 \0
0000470
$ od -l hello.pbc
0000000 20010401 0 180 4
0000020 115 56 0 0
0000040 0 38 1819043144 1768038511
0000060 1701405810 1919899424 757097580 1769108512
0000100 1818717793 1970151525 1919246957 2675
0000120 115 44 0 0
0000140 0 25 1936287828 544434464
0000160 1702109281 1864397939 1632641126 1953460850
0000200 10 115 24 0
0000220 0 0 7 1986619168
0000240 2126693 115 20 0
0000260 0 0 1 10
0000300 116 31 0 31
0000320 1 56 1 0
0000340 56 2 0 178
0000360 2 154 1 2
0000400 1 26 2 31
0000420 2 26 1 31
0000440 3 120 2 10
0000460 -14 0
0000470
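To see how the two dumps relate, the short Python sketch below takes ten of the integers from the "od -l" listing (offsets 0000040 to 0000100) and turns them back into characters. It assumes that the dump was made on a little-endian machine, and that the value 38 just ahead of these words in the dump is the length of the string in bytes; both are inferences from the listings above rather than anything taken from Parrot's documentation.

```python
import struct

# Ten 4-byte integers copied from the "od -l hello.pbc" output above.
words = [
    1819043144, 1768038511, 1701405810, 1919899424, 757097580,
    1769108512, 1818717793, 1970151525, 1919246957, 2675,
]

# Assumption: the value 38 that precedes these words in the dump is the string length.
length = 38

# Pack the integers back into raw bytes, least significant byte first
# (assumed little-endian), then keep only the bytes that belong to the string.
raw = struct.pack("<%di" % len(words), *words)
text = raw[:length].decode("ascii")

print(repr(text))
```

The last word, 2675, holds the final "s", the newline and two zero bytes of padding, which is why the ten words cover 40 bytes for a 38 character string.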

There's also a disassembler supplied with parrot - not only does this help you understand what's what in the byte code, but it also helps compiler writers and later the more advanced developers see the code that's actually generated by their applications. Here's our program disassembled; if you study this in association with the output dumps and original source code, you'll start to get a real flavour of how parrot works.

#
# Disassembly of Parrot Byte Code from 'hello.pbc'
#
# Segments:
#
# * Magic Number: 4 bytes
# * Fixup Table: 0 bytes
# * Const Table: 180 bytes
# * Byte Code: 116 bytes (29 opcode_ts)
#
# Constant Type Data
# -------- ------------ ------------------------------
# 0 PFC_STRING "Hello birdie World - triangle numbers\n"
# 1 PFC_STRING "This is a test of Parrot\n"
# 2 PFC_STRING " gives "
# 3 PFC_STRING "\n"
#
# WORD BYTE BYTE CODE LABEL OPERATION ARGUMENTS
# -------- ---------- ------------------------------------------------ ------ --------------- --------------------
  00000000 [00000000]: 00000031 00000000 print [sc:0]
  00000002 [00000008]: 00000031 00000001 print [sc:1]
  00000004 [00000016]: 00000056 00000001 00000000 set I1, 0
  00000007 [00000028]: 00000056 00000002 00000000 set I2, 0
  00000010 [00000040]: 00000178 00000002 inc I2
  00000012 [00000048]: 00000154 00000001 00000002 00000001 add I1, I2, I1
  00000016 [00000064]: 00000026 00000002 print I2
  00000018 [00000072]: 00000031 00000002 print [sc:2]
  00000020 [00000080]: 00000026 00000001 print I1
  00000022 [00000088]: 00000031 00000003 print [sc:3]
  00000024 [00000096]: 00000120 00000002 00000010 -0000014 lt I2, 10, -14
  00000028 [00000112]: 00000000 end
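To make the listing concrete, here is a small Python sketch that runs the 29 words shown in the disassembly. It is not Parrot itself: the opcode numbers (31, 26, 56, 178, 154, 120 and 0) are simply read off the disassembly of this early test release, the function name run_words is made up, and the sketch assumes that the branch offset in "lt" is counted in words from the start of that instruction, which is consistent with -14 leading from word 24 back to the "inc" at word 10.

```python
# String constants, as listed in the constant table of the disassembly.
CONSTANTS = [
    "Hello birdie World - triangle numbers\n",
    "This is a test of Parrot\n",
    " gives ",
    "\n",
]

# The 29 code words from the disassembly, one instruction per line.
CODE = [
    31, 0,             # print [sc:0]
    31, 1,             # print [sc:1]
    56, 1, 0,          # set I1, 0
    56, 2, 0,          # set I2, 0
    178, 2,            # inc I2
    154, 1, 2, 1,      # add I1, I2, I1
    26, 2,             # print I2
    31, 2,             # print [sc:2]
    26, 1,             # print I1
    31, 3,             # print [sc:3]
    120, 2, 10, -14,   # lt I2, 10, -14
    0,                 # end
]


def run_words(code, constants):
    """Interpret the words above and return everything the program prints."""
    regs = [0] * 32    # integer registers; only I1 and I2 are used here
    out = []
    pc = 0             # index of the current instruction, in words
    while True:
        op = code[pc]
        if op == 0:                                   # end
            break
        elif op == 31:                                # print string constant
            out.append(constants[code[pc + 1]])
            pc += 2
        elif op == 26:                                # print integer register
            out.append(str(regs[code[pc + 1]]))
            pc += 2
        elif op == 56:                                # set register to a constant
            regs[code[pc + 1]] = code[pc + 2]
            pc += 3
        elif op == 178:                               # inc register
            regs[code[pc + 1]] += 1
            pc += 2
        elif op == 154:                               # add: dest = src1 + src2
            regs[code[pc + 1]] = regs[code[pc + 2]] + regs[code[pc + 3]]
            pc += 4
        elif op == 120:                               # lt: branch if register < constant
            if regs[code[pc + 1]] < code[pc + 2]:
                pc += code[pc + 3]                    # assumed relative to this instruction
            else:
                pc += 4
        else:
            raise ValueError("unknown opcode %d at word %d" % (op, pc))
    return "".join(out)


print(run_words(CODE, CONSTANTS), end="")
```

Run as it stands, this should print the same twelve lines as the "parrot hello.pbc" session earlier, ending with "10 gives 55".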


See also Structure of Perl - Compiled or Interpretted?



© WELL HOUSE CONSULTANTS LTD., 2014: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • FAX: 01144 1225 899360 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/solutions/perl-par ... chine.html • PAGE BUILT: Wed Mar 28 07:47:11 2012 • BUILD SYSTEM: wizard