
Distributed Systems

by

Matthew Martin

Contents

  1. What Is A Distributed System?
    1. Peer-to-Peer Networks
    2. Clusters
    3. Grids
  2. Technologies & Approaches
    1. Message Passing
    2. CORBA
    3. RMI
    4. Java Enterprise Beans (EJB)
    5. XML
  3. Examples Of Advanced Distributed Systems
    1. CERN
    2. Pixar
    3. The Human Genome Project
    4. SETI
    5. Cracking Encryption
    6. The TeraGrid


What Is A Distributed System?

The term distributed system covers a range of different technologies, including peer-to-peer networks, clusters and grids. Distributed computing can be used to increase processing power (as in the SETI@home project) and to share files (as in corporate file sharing systems and Napster).

Peer-to-Peer Networks

Peer-to-peer networks are “flat” rather than hierarchical. The members of a peer-to-peer network can communicate with each other freely: access to other members of the network is not controlled through a central server. Peer-to-peer systems range in scale from a single LAN to large-scale distributed systems such as Napster.

Clusters

A number of computers can be grouped together to share processing power in performing a task or group of tasks. Such set-ups are used by web search engines and, at CERN, for processing the vast amounts of information gathered from particle collision experiments, to name just two examples.

Grids

Grids are large-scale networks of clusters. They are designed to provide massive computing power over extremely high-bandwidth connections, which can be accessed just by plugging in (a bit like plugging into the national electricity grid). An example is the TeraGrid project, currently under construction in the USA.


Technologies & Approaches

Distributed computing has a whole range of technologies available. Here just a few of the commonly used approaches and popular technologies are considered.

Message Passing

One often-used approach is message passing, in which the software is split into many small programs that run independently yet communicate with each other by sending and receiving messages.
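
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a thread stands in for one of the separate small programs, `queue.Queue` objects stand in for the communication channel, and the names `worker`, `inbox`, `outbox` and `run_demo` are invented for the example.

```python
import queue
import threading

STOP = None  # sentinel message telling the worker to shut down


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive numbers as messages and reply with (number, square)."""
    while True:
        message = inbox.get()          # block until a message arrives
        if message is STOP:
            break
        outbox.put((message, message * message))


def run_demo(numbers):
    """Send each number to the worker and collect the replies in order."""
    inbox, outbox = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    for n in numbers:
        inbox.put(n)                   # send a request message
    inbox.put(STOP)
    thread.join()
    return [outbox.get() for _ in numbers]


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))
```

The two sides share no variables; everything they know about each other travels through the queues. In a real distributed system the same pattern would run over a network connection rather than an in-memory queue.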


CORBA

Common Object Request Broker Architecture (CORBA) is overseen by the OMG (Object Management Group). It is an approach to the architectural design of software that can be used from many object-oriented (OO) languages. The architecture works through an Object Request Broker, which takes requests from clients and passes them on to the servers that provide the requested objects. This is very similar to the RMI architecture (see below) but, unlike RMI, CORBA supports more than one programming language. Usually these objects will encapsulate (independently represent) aspects of business logic.


RMI

Remote Method Invocation (RMI) is a Java technology, introduced by Sun Microsystems with JDK 1.1. RMI allows cross-platform distributed computing: Java code executing on different machines and platforms can work together by calling methods on remote objects. RMI has three layers (the stub and skeleton layer, the remote reference layer and the transport layer), shown in the diagram below.

[Diagram: the three layers of RMI]


Java Enterprise Beans (EJB)

These are not to be confused with JavaBeans. Java Enterprise Beans or, as they are more commonly known, Enterprise JavaBeans (EJB), encapsulate business logic and can also handle database interfaces. Because the business logic is encapsulated in discrete programmatic units, a client can request just the business logic it requires. EJB is purely a Java technology.


XML

EXtensible Mark-up Language (XML) is a meta-language: unlike HTML, which has a fixed set of tags, XML can be used to define other mark-up languages as and when required. Combined with a Document Type Definition (DTD) file that describes the form of the mark-up language, XML can be used to transfer data between different systems in a universally accepted plain text format. Parsers are able to read XML documents and check them against the form described by the DTD. For more information about XML, see the World Wide Web Consortium (W3C) web site.
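
As a small illustration, the sketch below reads a made-up `order` document using Python's standard `xml.etree.ElementTree` module. The element names, the attribute `quantity` and the function `read_order` are assumptions invented for the example. Note that this parser reads the DTD but does not validate against it; checking a document against its DTD needs a validating parser.

```python
import xml.etree.ElementTree as ET

# A hypothetical mark-up language for orders, with its DTD given inline.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ELEMENT order (item+)>
  <!ELEMENT item (#PCDATA)>
  <!ATTLIST item quantity CDATA #REQUIRED>
]>
<order>
  <item quantity="2">keyboard</item>
  <item quantity="1">monitor</item>
</order>
"""


def read_order(text):
    """Return a list of (item name, quantity) pairs from an order document."""
    root = ET.fromstring(text)  # parses the XML; the DTD is not enforced here
    return [(item.text, int(item.get("quantity")))
            for item in root.findall("item")]


if __name__ == "__main__":
    for name, quantity in read_order(DOCUMENT):
        print(quantity, name)
```

Because the document is plain text with a declared structure, any system with an XML parser can read the same order, whatever language or platform it runs on.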


Examples Of Advanced Distributed Systems

The technologies mentioned above, such as CORBA, RMI and XML, are used by corporations around the world to distribute business logic and data. The following are examples of cutting-edge advances in distributed technologies.

CERN

CERN is the European centre for particle physics research, based near Geneva on the border between Switzerland and France. The use of distributed computing is growing rapidly here; after all, this is the birthplace of the World Wide Web. Currently under construction is a huge distributed system for analysing data from the Large Hadron Collider, which collides subatomic particles together, producing the kinds of particles that existed at the beginning of the Universe. The amounts of data generated are so large that only distributed computing can muster sufficient computing power.

http://public.web.cern.ch/public/


Pixar

The digital film company Pixar used a distributed arrangement of 117 Sun SPARCstations to generate the graphics for the children’s film Toy Story. Distributing the work allowed the complex calculations that simulate the path of light to be performed rapidly. A single such computer performing the same task would have taken more than 40 years to produce the film. You can find out how they did it at:

http://sunsite.anu.edu.au/sunaus/toystory.html
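
To get a feel for the numbers, assume (optimistically) that the rendering work divides evenly across all 117 machines with no communication overhead. Taking the single-machine figure of 40 years quoted above as T_1, the idealised time on 117 machines is roughly:

```latex
T_{117} \approx \frac{T_1}{117} = \frac{40\ \text{years}}{117} \approx 0.34\ \text{years} \approx 125\ \text{days}
```

This is an estimate under the stated assumption of perfect speed-up, not Pixar's actual rendering time. Since the single-machine figure is "more than 40 years", the real total would be somewhat higher.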


The Human Genome Project

In order to store the huge amounts of data generated by examining the human genome (the sequence of bases that make up the human genetic code), a specially designed distributed database has been implemented. This is part of a growing area called bioinformatics.

http://www.doegenomes.org/


SETI

The Search for Extra-Terrestrial Intelligence (SETI) collects vast amounts of data from radio telescopes around the world and from satellites. This data cannot all be processed by the computers directly available to the SETI team, so the SETI@home project was launched to enhance their computing power. Individuals download a screen saver program to their PC, which communicates with a central server over the Internet. Data is downloaded from the server and processed, and the results are passed back to the server. Since the program is a screen saver, the system uses the computing power of many thousands of people’s computers all over the world while their owners are away from their desks. You can join in SETI by downloading the screen saver from their web site.

http://www.seti.org

http://www.seti.org/science/setiathome.html
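
The download, process and return cycle described above can be sketched as follows. This is a simplified model, not the real SETI@home software: the class `WorkServer`, the functions `analyse` and `volunteer`, and the sample data are all invented for illustration, and ordinary method calls stand in for communication over the Internet.

```python
from collections import deque


class WorkServer:
    """Hypothetical central server: hands out work units, collects results."""

    def __init__(self, work_units):
        self.pending = deque(enumerate(work_units))  # (unit_id, data) pairs
        self.results = {}

    def next_unit(self):
        """Return the next (unit_id, data) pair, or None when none are left."""
        return self.pending.popleft() if self.pending else None

    def submit(self, unit_id, result):
        """Record the result a volunteer computed for one work unit."""
        self.results[unit_id] = result


def analyse(samples):
    """Stand-in for signal analysis: report the strongest sample."""
    return max(samples)


def volunteer(server, max_units):
    """One volunteer PC: fetch, process and return up to max_units units."""
    done = 0
    while done < max_units:
        unit = server.next_unit()
        if unit is None:
            break
        unit_id, data = unit
        server.submit(unit_id, analyse(data))
        done += 1
    return done


def run_demo():
    server = WorkServer([[1, 5, 2], [7, 3], [4, 4, 9], [0, 6]])
    counts = [volunteer(server, 3), volunteer(server, 3)]  # two volunteers
    return counts, server.results
```

The server never needs to know how many volunteers exist; each one simply asks for more work when it is idle, which is what lets the system scale to many thousands of machines.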


Cracking Encryption

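The distributed.net group organised a distributed effort to crack a highly secure encrypted test message in the RC5 project, a challenge set by RSA Laboratories. The space of possible keys is divided into blocks, which volunteers’ computers search in parallel.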

http://www.distributed.net
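
A distributed key search of this kind can be sketched as below. The sketch makes several loud assumptions: it uses a toy XOR cipher with a 16-bit key rather than RC5, it assumes the first few bytes of the plaintext are known, and threads stand in for separate machines (in Python they give no real speed-up for this kind of work). All function names are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

KEY_BITS = 16  # toy key size; real challenge keys are far larger


def xor_cipher(data: bytes, key: int) -> bytes:
    """Toy cipher (NOT RC5): XOR the data with a repeating 2-byte key."""
    key_bytes = key.to_bytes(2, "big")
    return bytes(b ^ key_bytes[i % 2] for i, b in enumerate(data))


def search_block(ciphertext, known_prefix, start, stop):
    """Try every key in [start, stop); return the key that fits, or None."""
    head = ciphertext[:len(known_prefix)]  # only the prefix needs decrypting
    for key in range(start, stop):
        if xor_cipher(head, key) == known_prefix:
            return key
    return None


def crack(ciphertext, known_prefix, workers=4):
    """Split the key space into blocks and search the blocks concurrently."""
    total = 1 << KEY_BITS
    block = total // workers
    ranges = [(i * block, total if i == workers - 1 else (i + 1) * block)
              for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        found = pool.map(
            lambda r: search_block(ciphertext, known_prefix, r[0], r[1]),
            ranges)
        return next((key for key in found if key is not None), None)


if __name__ == "__main__":
    secret = xor_cipher(b"The quick brown fox", 0xBEEF)
    print(hex(crack(secret, b"The ")))
```

Each block of keys can be searched without reference to any other block, so the work spreads across as many machines as are available.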


The TeraGrid

The TeraGrid is a project in the USA to connect some of the most powerful supercomputers into a single distributed network, using extremely high-bandwidth links (tens of gigabits up to terabits per second) that cross the continent of North America. Users can plug into this massive grid to access the processing power available. The initial uses of the TeraGrid will be to examine pollution in the environment and the origins of the Universe.

http://www.teragrid.org
