
What Is A Distributed System?
The term distributed system refers to a range of different technologies,
including peer-to-peer networks, clusters and grids. Distributed computing
can be used to increase processing power (as in the SETI@home project) and
to share files (as in corporate file sharing systems and Napster).
Peer-to-Peer Networks
Peer-to-peer networks are “flat” rather than hierarchical. The members of
a peer-to-peer network can communicate with each other freely; access to
other members of the network is not controlled through a central server.
Peer-to-peer systems range in scale from a single LAN to large distributed
systems such as Napster.
Clusters
A number of computers can be grouped together to share processing power
when performing a task or group of tasks. Such set-ups are used for web
search engines and at CERN for processing the vast amounts of information
gathered from particle collision experiments, to name just two examples.
Grids
Grids are large-scale networks of clusters. They are designed to provide
massive computing power over extremely high-bandwidth connections, and can
be accessed just by plugging in (a bit like plugging into the national
electricity grid). An example is the TeraGrid project, currently under
construction in the USA.

Technologies & Approaches
A whole range of technologies is available for distributed computing. Only
a few of the commonly used approaches and popular technologies are
considered here.
Message Passing
One often-used approach is message passing, where the software is split
up into many small programs that run independently yet communicate with
each other by sending messages.
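The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two threads stand in for the separate small programs, in-memory queues stand in for the communication channel, and the names `squarer` and `run_pipeline` are hypothetical. The two sides share no state and interact only through the messages they exchange.

```python
import queue
import threading


def squarer(inbox, outbox):
    """Worker 'program': receive numbers, send back their squares."""
    while True:
        msg = inbox.get()
        if msg is None:          # sentinel message: no more work
            outbox.put(None)     # tell the receiver we are finished
            break
        outbox.put(msg * msg)


def run_pipeline(numbers):
    """Send each number to the worker and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=squarer, args=(inbox, outbox))
    worker.start()

    for n in numbers:
        inbox.put(n)
    inbox.put(None)

    results = []
    while True:
        reply = outbox.get()
        if reply is None:
            break
        results.append(reply)

    worker.join()
    return results
```

In a real distributed system the queues would be replaced by network connections or a messaging library, but the structure of sending, receiving and signalling completion stays the same.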
CORBA
Common Object Request Broker Architecture (CORBA) is overseen by the OMG
(Object Management Group). It is an approach to the architectural design
of software that is not tied to one language: the OMG defines mappings for
many programming languages, object oriented (OO) and otherwise. The
architecture works through an Object Request Broker that takes requests
from clients and passes them on to servers that provide the requested
object. This is very similar to the RMI architecture (see below) but,
unlike RMI, CORBA supports more than one programming language. Usually
these objects will encapsulate (independently represent) aspects of
business logic.
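The broker's role can be illustrated with a toy example. This is not the CORBA API: there is no IDL, no network and no language mapping, and the names `ObjectRequestBroker`, `register`, `invoke` and `PriceCalculator` are all hypothetical. It only shows the shape of the interaction, in which the client names an object and an operation and the broker finds the object and forwards the request.

```python
class ObjectRequestBroker:
    """Toy in-process broker: maps object names to the objects serving them."""

    def __init__(self):
        self._servants = {}

    def register(self, name, servant):
        # A server makes one of its objects available under a name.
        self._servants[name] = servant

    def invoke(self, name, operation, *args):
        # A client asks for an operation on a named object; the broker
        # locates the object and passes the request on to it.
        servant = self._servants[name]
        return getattr(servant, operation)(*args)


class PriceCalculator:
    """Example object encapsulating a small piece of business logic."""

    def price_with_tax(self, net, rate):
        return round(net * (1 + rate), 2)


orb = ObjectRequestBroker()
orb.register("PriceCalculator", PriceCalculator())

# The client never holds a direct reference to the PriceCalculator object.
total = orb.invoke("PriceCalculator", "price_with_tax", 100.0, 0.2)
```

A real broker would also marshal the arguments, send them across the network and work from an interface description that both sides share.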
RMI
Remote Method Invocation (RMI) is a Java technology, introduced by Sun
Microsystems in JDK 1.1. Because it runs on the Java platform, RMI allows
for cross-platform distributed computing: code executing on different
machines and operating systems can work together by calling methods on
remote objects. RMI has three layers (the stub and skeleton layer, the
remote reference layer and the transport layer), shown in the diagram
below.

Java Enterprise Beans (EJB)
These are not to be confused with Java Beans. Java Enterprise Beans or, as
they are more commonly known, Enterprise Java Beans (EJB) encapsulate
business logic and can also handle database interfaces. Encapsulating
business logic into discrete programmatic units means that a client can
request just the business logic it requires. EJB is purely a Java
technology.
XML
EXtensible Mark-up Language (XML) is a meta-language: rather than being a
fixed mark-up language like HTML, it can be used to define other mark-up
languages as and when required. Combined with a Document Type Definition
(DTD) that describes the form of the mark-up language, XML can be used to
transfer data between different systems in a widely accepted plain text
format. Validating parsers are able to check XML documents against the
form described by the DTD. For more information about XML, see the World
Wide Web Consortium (W3C) web site.
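As a small illustration, the sketch below defines a made-up mark-up language for orders with an inline DTD and reads it with Python's standard library. The element names (`order`, `item`), the `quantity` attribute and the function `read_order` are assumptions for the example only. Note also that `xml.etree.ElementTree` is a non-validating parser: it accepts the DTD but does not check the document against it, so a validating parser would be needed for that step.

```python
import xml.etree.ElementTree as ET

# A hypothetical order format. The DTD (between the square brackets)
# says an order holds one or more items, each with a quantity attribute.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ELEMENT order (item+)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item quantity CDATA #REQUIRED>
]>
<order>
  <item quantity="2">widget</item>
  <item quantity="5">gadget</item>
</order>
"""


def read_order(xml_text):
    """Return (name, quantity) pairs for each item in the order."""
    root = ET.fromstring(xml_text)
    return [(item.text, int(item.get("quantity")))
            for item in root.findall("item")]
```

Because the document is plain text with an agreed structure, the receiving system can be written in any language that has an XML parser.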

Examples Of Advanced Distributed Systems
The technologies mentioned above, such as CORBA, RMI and XML, are used by
corporations around the world to distribute business logic and data. The
following are examples of cutting-edge advances in distributed
technologies.
CERN
CERN is the European centre for particle physics research, based near
Geneva on the border between Switzerland and France. The use of
distributed computing is growing rapidly here, which is fitting for the
birthplace of the World Wide Web. Currently under construction is a huge
distributed system aimed at the analysis of data from the Large Hadron
Collider, which collides subatomic particles together, producing the kinds
of fundamental particles that existed at the beginning of the Universe.
The amounts of data generated are so large that it is only through
distributed computing that sufficient computing power can be mustered.
http://public.web.cern.ch/public/
Pixar
The digital film company Pixar used a distributed arrangement of 117 Sun
SPARCstations to generate the graphics for the children’s film Toy Story.
Distributing the work allowed the complex calculations that simulate the
path of light to be performed rapidly. A single such computer performing
the same task would have taken more than 40 years to produce the film. You
can find out how they did it at:
http://sunsite.anu.edu.au/sunaus/toystory.html
The Human Genome Project
In
order to store the huge amounts of data generated by examining the
human genome (the sequence of bases that make up the human genetic
code), a specially designed distributed database has been implemented.
This is part of a growing area called bioinformatics.
http://www.doegenomes.org/
SETI
The Search for Extra-Terrestrial Intelligence (SETI) collects vast amounts
of data from radio telescopes around the world and from satellites. This
data cannot all be processed by the computers available directly to the
SETI team. To increase their computing power, the SETI@home project was
launched. Individuals download a screen saver program to their PC, which
communicates with a central server over the Internet. Data is downloaded
from the server and processed, and the results are passed back to the
server. Since the program is a screen saver, the system uses the computing
power of many thousands of people’s computers all over the world while
their owners are away from their desks. You can join in by downloading the
screen saver from their web site.
http://www.seti.org
http://www.seti.org/science/setiathome.html
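The download, process and return cycle described above can be sketched as follows. This is not SETI@home's actual software or protocol: the class `WorkServer`, its methods and the "analysis" (simply finding the largest sample in each chunk) are hypothetical stand-ins, and everything runs in one process rather than over the Internet.

```python
from collections import deque


class WorkServer:
    """Toy central server: splits the data into work units and stores results."""

    def __init__(self, samples, unit_size):
        # Each work unit is (unit_id, chunk); the id is the chunk's start index.
        self._pending = deque(
            (start, samples[start:start + unit_size])
            for start in range(0, len(samples), unit_size)
        )
        self.results = {}

    def fetch_unit(self):
        """Hand out the next unprocessed work unit, or None when none remain."""
        return self._pending.popleft() if self._pending else None

    def submit(self, unit_id, result):
        """Record the result a client sends back for one work unit."""
        self.results[unit_id] = result


def analyse(chunk):
    """Stand-in for the real signal analysis: report the strongest sample."""
    return max(chunk)


def run_client(server):
    """One volunteer's computer: fetch, process and return units until done."""
    completed = 0
    while True:
        unit = server.fetch_unit()
        if unit is None:
            break
        unit_id, chunk = unit
        server.submit(unit_id, analyse(chunk))
        completed += 1
    return completed
```

Because each work unit is independent, any number of clients can call `fetch_unit` and `submit` without needing to know about one another, which is what lets the approach scale to many thousands of volunteer machines.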
Cracking Encryption
The distributed.net group took on the challenge of cracking a highly
secure encrypted test message in its RC5 project, sharing the search for
the key among volunteers' computers.
http://www.distributed.net
The TeraGrid
The TeraGrid is a project in the USA to connect some of the most powerful
supercomputers into a single distributed network, using extremely
high-bandwidth links (tens of gigabits up to terabits per second) that
cross the continent of North America. Users can plug into this massive
grid to access the processing power available. The initial uses of the
TeraGrid will be to examine pollution in the environment and the origins
of the Universe.
http://www.teragrid.org

by Matthew Martin