Lecture 1:
Multi-Tier Organization of an
Information System
Gustavo Alonso
Systems Group
Computer Science Department - ETH Zurich
alonso@inf.ethz.ch
Contents
Design of an information system
Layers and tiers
Bottom up design
Top down design
Architecture of an information system
One tier
Two tier (client/server)
Three tier (middleware)
N-tier architectures
Clusters and tier distribution
Why Tiers?
Information Systems are divided into different tiers.
The tiers can be conceptual or real
The tiers can be implemented using different technologies or using a single vendor stack (cfr. vendor lock-in,
standards)
Distinguishing between tiers is, in
practice, not an obvious exercise as the different system modules will also be divided according to business function (the system of the legal department, the system of the accounting department, etc.)
While the different tiers are a software engineering concept, they have almost always appeared as a result of changes in technology (computers, hardware, or networking)
The notion of tiers will allow us to describe the architecture of modern IT infrastructures abstracting from the system details
Tiered architecture
©Gustavo Alonso, ETH Zürich. 5
Some practical notation
Horizontal layer: refers to a layer
that is application independent and
intended to work on a wide range of
business environments
Vertical architecture/solution: refers
to systems tailored to concrete
applications and industry branches
(e.g., insurance or banking)
Single vendor stack: all the layers
are produced by a single vendor
(e.g., WebSphere of IBM,
WebLogic of BEA)
Host (hosted solution): mainframe,
mainframe based applications
TCO (Total Cost of Ownership): the
global cost of an IT system,
including maintenance,
hw-sw-people, licenses, etc.
[Figure: tiered architecture. Tier labels: CLIENT TIER, ACCESS TIER, APP TIER, INTEGRATION TIER. Box labels: web client, java client, WS client; api; business object; wrapper.]
Components named in the figure:
Web browsers, specialized clients (Java, .NET), Web Services...
Web servers, RMI, RPC J2EE, CGI, JAVA Servlets API
app. Servers, TP-Monitors, stored procedures, Java beans
system federations, filters, object brokers, message brokers
databases, multi-tier systems
Protocols named in the figure:
HTML, SOAP, XML
MOM, HTML, IIOP, RMI-IIOP, SOAP, XML
MOM, IIOP, RMI-IIOP, XML
A game of boxes and arrows
Each box represents a part of the system.
Each arrow represents a connection
between two parts of the system.
The more boxes, the more modular the
system: more opportunities for distribution
and parallelism.
This allows encapsulation, component
based design, reuse.
The more boxes, the more arrows: more
sessions (connections) need to be
maintained, more coordination is
necessary. The system becomes more
complex to monitor and manage.
The more boxes, the greater the number of
context switches and intermediate steps to
go through before one gets to the data.
Performance suffers considerably.
System designers try to balance the
flexibility of modular design with the
performance demands of real applications.
Once a layer is established, it tends to
migrate down and merge with lower layers.
There is no problem in system design that cannot be solved by adding a level of indirection.
There is no performance problem that cannot be solved
Layers and tiers
A client is any user or program that wants to perform an operation over the system. Clients interact with the system through a presentation layer.
The application logic determines what the system actually does. It takes care of enforcing the business rules and establishing the business processes. The application logic can take many forms: programs, constraints, business processes, etc.
The resource manager deals with the organization (storage, indexing, and retrieval) of the data necessary to support the application logic. This is typically a database but it can also be a text retrieval system or any other data management system providing querying capabilities and persistence.
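The three layers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the request format "withdraw <account> <amount>", and the no-overdraft rule are all hypothetical and not part of the lecture.

```python
class ResourceManager:
    """Resource management layer: stores and retrieves data (here an in-memory dict)."""

    def __init__(self):
        self._balances = {"alice": 100, "bob": 50}

    def get_balance(self, account):
        return self._balances[account]

    def set_balance(self, account, amount):
        self._balances[account] = amount


class ApplicationLogic:
    """Application logic layer: enforces the business rules."""

    def __init__(self, resource_manager):
        self._rm = resource_manager

    def withdraw(self, account, amount):
        balance = self._rm.get_balance(account)
        # Hypothetical business rule: no overdrafts, no non-positive amounts.
        if amount <= 0 or amount > balance:
            raise ValueError("withdrawal rejected")
        self._rm.set_balance(account, balance - amount)
        return balance - amount


class PresentationLayer:
    """Presentation layer: parses client requests and formats the answers."""

    def __init__(self, logic):
        self._logic = logic

    def handle(self, request):
        # Assumed request format: "withdraw <account> <amount>".
        _, account, amount = request.split()
        try:
            new_balance = self._logic.withdraw(account, int(amount))
            return f"OK {account} balance={new_balance}"
        except ValueError as err:
            return f"ERROR {err}"


ui = PresentationLayer(ApplicationLogic(ResourceManager()))
print(ui.handle("withdraw alice 30"))
print(ui.handle("withdraw bob 500"))
```

Each class talks only to the layer directly below it, which is what later makes it possible to place the layers in different tiers.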
Top down design
The functionality of a system is divided among several modules. Modules cannot act as separate components; their
functionality depends on the functionality of other modules.
Hardware is typically homogeneous and the system is designed to be distributed from the beginning.
[Figure: top-down design and the resulting top-down architecture. Labels: PL-A, PL-B, PL-C; AL-A, AL-B; RM-1, RM-2]
Top down design process
Layers in the figure: client, presentation layer, application logic layer, resource management layer
1. define access channels and client platforms
2. define presentation formats and protocols for the selected clients and protocols
3. define the functionality necessary to deliver the contents and formats needed at the presentation layer
4. define the data sources and data organization needed to implement the application logic
Bottom up design
In a bottom up design, many of the basic components already exist. These are stand alone systems which need to be integrated into new systems.
The components do not necessarily cease to work as stand alone
components. Often old applications continue running at the same time as new applications.
This approach has a wide application because the underlying systems already exist and cannot be easily replaced.
Much of the work and products in this area are related to middleware, the intermediate layer used to provide a
common interface, bridge heterogeneity, and cope with distribution.
[Figure labels: Legacy systems, New]
Bottom up design
[Figure: bottom-up design and the resulting architecture. Labels: PL-A, PL-B, PL-C; AL-A, AL-B; wrapper (one per underlying system); legacy application]
Bottom up design process
Layers in the figure: client, presentation layer, application logic layer, resource management layer
1. define access channels and client platforms
2. examine existing resources and the functionality they offer
3. wrap existing resources and integrate their functionality into a consistent interface
4. adapt the output of the application logic so that it can be used with the required access channels and client protocols
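Step 3 (wrapping existing resources behind a consistent interface) can be sketched as follows. Both "legacy" functions, their record formats, and the wrapper names are hypothetical stand-ins chosen for illustration; real legacy systems would be reached through whatever access method they offer.

```python
def legacy_billing_lookup(customer_id):
    """Hypothetical legacy system A: returns a fixed-width record string."""
    records = {"0042": "0042" + "SMITH".ljust(10) + "000129.50"}
    return records[customer_id]


def legacy_crm_query(name):
    """Hypothetical legacy system B: returns a tuple keyed by upper-case name."""
    rows = {"SMITH": ("SMITH", "Zurich", "gold")}
    return rows[name]


class BillingWrapper:
    """Hides the fixed-width record layout of legacy system A."""

    def customer(self, customer_id):
        raw = legacy_billing_lookup(customer_id)
        return {
            "id": raw[0:4],
            "name": raw[4:14].strip(),
            "amount_due": float(raw[14:]),
        }


class CrmWrapper:
    """Hides the tuple layout and naming convention of legacy system B."""

    def profile(self, name):
        _, city, status = legacy_crm_query(name.upper())
        return {"city": city, "status": status}


def customer_overview(customer_id):
    """New application logic built on the wrappers, not on the legacy formats."""
    billing = BillingWrapper().customer(customer_id)
    profile = CrmWrapper().profile(billing["name"])
    return {**billing, **profile}


print(customer_overview("0042"))
```

The legacy functions keep working unchanged for their old users; only the wrappers know their formats.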
System design
In most practical settings, system design happens mostly bottom up
Legacy systems
Existing infrastructures
Reuse of existing services
The main cost of bottom up design lies in the wrappers and interfaces to older applications
Without proper planning, these
interfaces and wrappers not only become performance problems, they also turn into major limitations in terms of functionality
Evolving the system may require redoing everything, as the different
elements become very dependent on the implementation of others
Top down design is conceptually more attractive since it allows the designer to make clean choices and build elegant systems
In practice, however, a complete top down design can only be done if:
Nothing exists already
or
The systems to use have clear and well defined interfaces that abstract from the details of the system
underneath. These interfaces should also be independent of the
implementation (this is what Web services are all about)
Layered design
One Tier
Everything in one big black box
(presentation, logic and resources)
Examples:
Mainframe applications
Virtualization
One Tier can be a physical single
tier (older mainframe applications)
or virtual (a web server as seen by a
web browser)
Two Tier
Presentation layer (client) separated
from logic and resource layers
(server)
Examples:
Applet
Google Earth
Presentation layer is not just the
interface but the processing of the
Three Tier
Each layer separated
Examples:
CORBA
J2EE
Most web servers
Three tier is just a way to architect
an application. Typically, three tier
systems are implemented with
middleware: platforms supporting
the development of the middle tier
N Tier
Generalization of the above
Historical perspective
One Tier
Mainframe
No API
Dumb terminals
Two Tier:
Client-Server
New technology • PCs
• Networking
New software concepts • RPC
• API
Three Tier:
Middleware
New Technology • Internet
• Server proliferation
Functional perspective
One Tier Highly centralized
No client code
Highly optimized • Maintenance • Distribution • Performance
Two Tier
Use performance of client
Client specialization
Integration of Applications
Three Tier
Platform based
Better management of resources (pooling)
Clusters of computers
Dynamic allocation of resources
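As an illustration of the "pooling" item above, the following sketch shows a middle tier sharing a small, fixed set of resource manager connections among many clients. The class name and the string stand-ins for connections are assumptions made for this example only.

```python
import queue


class ConnectionPool:
    """A fixed set of connections shared by all requests handled in the middle tier."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for number in range(size):
            # A string stands in for a real database connection in this sketch.
            self._idle.put(f"db-connection-{number}")

    def acquire(self, timeout=None):
        # Blocks until a connection is idle (or raises queue.Empty on timeout).
        return self._idle.get(timeout=timeout)

    def release(self, connection):
        self._idle.put(connection)


pool = ConnectionPool(size=2)
served = []
for client in range(5):
    connection = pool.acquire()
    served.append((client, connection))
    pool.release(connection)

# Five clients were served, yet the resource manager saw at most two connections.
print(len(served), len({connection for _, connection in served}))
```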
Working with the perspectives
There is no such thing as a wrong architecture:
Architectures are more or less appropriate for a scenario
All architectures have limitations
But often they do not matter since the intended use will never reach the
limits
Know the limits of each architecture
Past use and problems as technology evolves do not “kill” an architecture
All forms of architectures are in use today
All of them are useful in given contexts
Key to choosing the right layered architecture is knowing
The requirements (performance, architectural, evolution)
The intrinsic properties of each architecture
One tier: fully centralized
The presentation layer, application logic
and resource manager are built as a
monolithic entity.
Users/programs access the system through
display terminals but what is displayed
and how it appears is controlled by the
server.
(These are “dumb” terminals).
This was the typical architecture of
mainframes, offering several advantages:
no forced context switches in the
control flow (everything happens
within the system),
all is centralized, managing and
controlling resources is easier,
the design can be highly optimized by
blurring the separation between layers.
What is a mainframe?
http://www.mainframes.com/whatis.htm
1-tier architecture
Physical one tier architectures
One tier systems tend to be monolithic: No need for interfaces to the outside world (which eventually leads to the problem of screen scraping: Google for “host screen scraping”)
All the functionality sits in a single system and runs best in a single (large) computer
Are very difficult to take apart and divide into different modules (this is why mainframes are still around)
Historically, as long as one tier systems were the norm:
There was no need for standardization of interfaces
Separation between application, infrastructure, and operating system was fuzzy (e.g., discussions on transactional operating systems)
It was realistic to think about using dedicated architectures to run some of these applications
Today, one tier systems still abound:
Legacy systems that are difficult or very costly to replace
Offer a degree of reliability that is not clearly reachable with other solutions
Software optimized and tested over a long time period (unlike new software)
Integration through Screen Scraping
Technically, screen scraping is integration through the presentation layer by extracting information from the screens presented to the user.
Practically, screen scraping involves capturing the information in the screen, parsing its contents, and extracting the relevant data.
There are many tools that help with screen scraping both for host environments and for HTML/web pages.
Screen scraping is expensive and difficult to maintain, and it offers only very limited bandwidth and high latency
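A minimal sketch of the capture, parse, extract sequence is shown below. The screen contents, the field names, and the "LABEL : value" layout are invented for this example; a real host screen would have its own layout, and any change to that layout breaks the parser, which is one reason why screen scraping is hard to maintain.

```python
import re

# Hypothetical terminal screen as a host application might render it.
SCREEN = "\n".join([
    "ACCT INQUIRY                    PAGE 1",
    "ACCOUNT NO : 00123456",
    "HOLDER     : J. MUELLER",
    "BALANCE    :     1,204.75 CHF",
    "PF3=EXIT  PF8=NEXT",
])


def scrape_account(screen):
    """Extract the relevant data from text that was formatted for a human reader."""
    fields = {}
    for line in screen.splitlines():
        # Keep only lines of the assumed form "LABEL : value".
        match = re.match(r"^([A-Z ]+?)\s*:\s*(.+?)\s*$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2)
    amount, currency = fields["BALANCE"].split()
    return {
        "account": fields["ACCOUNT NO"],
        "holder": fields["HOLDER"],
        "balance": float(amount.replace(",", "")),
        "currency": currency,
    }


print(scrape_account(SCREEN))
```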
One tier is not the same as mainframe
Older one tier architectures run on mainframes but not all applications on mainframes are one tier (modern ones are definitely not)
Mainframe is less a concrete computer architecture than a set of concepts that are used in large IT data centers
Mainframes are used today because of
Reliability, serviceability and fault tolerance (availability) – the hardware in a
mainframe has built-in self-checking and self-recovery, and can seamlessly recover from errors
Scalability – mainframes have an architecture that is highly optimized with
dedicated processors for OS tasks and a very high I/O bandwidth. It is difficult to beat a mainframe in terms of performance for large batch jobs or loads with a very large number of concurrent users.
Mainframes host many legacy applications that would be very costly to port to other languages or platforms
Browse the manual of an IBM mainframe line and OS for more information:
Why mainframes today?
There are many reasons why mainframes are still in use today:
High performance computing: everything else being equal (CPU power,
memory, resources), a centralized application will always be faster than a
distributed one (because it does not have the networking overhead)
Reliability: mainframes implement reliability in hardware and can mask
hardware failures automatically
Disk I/O: mainframes implement I/O through dedicated processors. A
standard mainframe has many such processors and can run a large number
of I/O channels (in the hundreds of thousands). The CPU does not stop
processing to do I/O, reducing context switches and I/O overhead (this is
very important in large databases, data warehouses, and large applications in
general)
Virtual one tier architectures as pattern
Consider it an architectural pattern
Advantages:
No software distribution (or limited)
• Example: web browser (in spirit)
Centralized upgrades and maintenance
Software as a service
• Charge per access, work, or data volume
• No need for licensing software at the client
Centralized control
Disadvantages
No use of resources at the client
Client diversity has to be dealt with internally
Integration implies embedding the system to be integrated in the one-tier system (not practical at large scale, example mash-ups)
This pattern is becoming more and more important
Software as a service
Search engines (Google, Yahoo)
Two tier: client/server
As computers became more powerful, it was possible to move the presentation layer to the client. This has several advantages:
Clients are independent of each other: one could have several presentation layers
depending on what each client wants to do. One can take advantage of the computing
power at the client machine to have more sophisticated presentation layers. This also saves computer resources at the server machine.
It introduces the concept of API (Application Program Interface): an interface to invoke the system from the outside. It also allows designers to think about federating the systems into a single system.
The resource manager only sees one client: the application logic. This greatly helps with performance since there are no client connections/sessions to maintain.
2-tier architecture
API in client/server
Client/server systems introduced the notion of service (the client invokes a service implemented by the server)
Together with the notion of service, client/server introduced the notion of service interface (how the client can invoke a given service)
Taken all together, the interfaces to all the services provided by a server (whether they are application or system specific) define the server’s Application Program Interface (API), which describes how to interact with the server from the outside
Many standardization efforts were triggered by the need to agree on common APIs for each type of server
[Figure: the server’s API as the collection of service interfaces. Labels: service, service interface, server’s API, resource management layer]
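The relation between services, service interfaces, and the server's API can be sketched as follows. The two services, the bank example, and the helper api_of are hypothetical and only serve to illustrate the idea that the API is the union of the service interfaces.

```python
from abc import ABC, abstractmethod


class AccountService(ABC):
    """Service interface: how a client invokes the (hypothetical) account service."""

    @abstractmethod
    def balance(self, account): ...


class TransferService(ABC):
    """Service interface: how a client invokes the (hypothetical) transfer service."""

    @abstractmethod
    def transfer(self, source, target, amount): ...


class BankServer(AccountService, TransferService):
    """The server implements the services; clients only see the interfaces."""

    def __init__(self):
        self._accounts = {"a": 100, "b": 20}

    def balance(self, account):
        return self._accounts[account]

    def transfer(self, source, target, amount):
        if amount > self._accounts[source]:
            raise ValueError("insufficient funds")
        self._accounts[source] -= amount
        self._accounts[target] += amount


def api_of(server):
    """The server's API: all operations it exposes to the outside."""
    return sorted(
        name for name in dir(server)
        if not name.startswith("_") and callable(getattr(server, name))
    )


server = BankServer()
server.transfer("a", "b", 30)
print(api_of(server), server.balance("a"), server.balance("b"))
```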
Technical aspects of client-server
There are clear technical advantages when going from one tier to two tier architectures:
take advantage of client capacity to off-load work to the clients
work within the server takes place within one scope (almost as in 1 tier)
the server design is still tightly coupled and can be optimized by ignoring presentation issues
still relatively easy to manage and control from a software engineering point of view
However, two tier systems have disadvantages:
The server has to deal with all possible client connections. The maximum number of clients is given by the number of connections supported by the server.
Clients are “tied” to the system since there is no standard presentation layer. If one wants to connect to two systems, then the client needs two presentation layers.
There is no failure or load encapsulation. If the server fails, nobody can work. Similarly, the load created by a client will directly affect the work of others since they are all competing for the same resources.
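The first disadvantage (the number of clients is bounded by the number of connections the server supports) can be illustrated with a toy server. The class, its limit, and the client identifiers are assumptions of this sketch, not a model of any real server.

```python
class TwoTierServer:
    """Toy server that keeps one session per connected client."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self._sessions = set()

    def connect(self, client_id):
        # Every client holds its own connection, so the limit caps the client count.
        if len(self._sessions) >= self.max_connections:
            raise ConnectionRefusedError("server connection limit reached")
        self._sessions.add(client_id)

    def disconnect(self, client_id):
        self._sessions.discard(client_id)


server = TwoTierServer(max_connections=2)
server.connect("c1")
server.connect("c2")
try:
    server.connect("c3")
except ConnectionRefusedError as err:
    print("c3 rejected:", err)
```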
Scalability limitations of client-server
[Figure: scalability limits on the server side and on the client side. Labels: Server, Clients, Client; multiplicities x 10’s, x 100’s, x 1000’s, x 10000’s, … (internet scale)]
Architectural limitations of client-server
If clients want to access two or more servers, a 2-tier architecture causes several problems:
the underlying systems don’t know about each other
the client is the point of integration (increasingly fat clients)
The responsibility of dealing with heterogeneous systems is shifted to the client.
The client becomes responsible for knowing where things are, how to get to them, and how to ensure consistency
This is tremendously inefficient from all points of view (software design, portability, code reuse, performance since the client capacity is limited, etc.).
There is very little that can be done to solve these problems while staying within the 2 tier model.
[Figure: a client accessing Server A and Server B]
Thin vs Fat clients
Thin clients
Minimal functionality, typically
restricted to the presentation layer
Advantages:
Small code base
Easy to update and upgrade
Easy to port across platforms
Complexity is left in the server
Disadvantages:
Does not use computing
capabilities of client (storage,
CPU, memory)
Functionality of the client
necessarily limited
Not everything can be done at
the server
Fat clients
Extended functionality, including
parts of the application layer and
integration code
Advantages:
Client is powerful and does not
load the server
Client processing can be tailored
to different users
Added value at the client side
(commercial advantage)
Disadvantages:
Larger code base
Increases maintenance and
upgrade costs
Two tier architectures as pattern
Two tier is a common integration pattern
Advantages
Uses resources at the client
Customization by modifying the client
Client can be a UI or an application
Forces the application to have an interface
Integration is easier through the client interface
Centralized control for server (one-tier)
Disadvantages
Software at the client (remote) is part of the system
Maintenance requires coordinating server and client
Backward compatibility issues for older clients
Stateless vs Stateful clients
Connection management at the server becomes bottleneck
Client is server dependent (no common client across servers)
Performance loss through context switch client-server and networking
Three tier: middleware
In a 3 tier system, the three layers
are fully separated.
The layers are also typically
distributed taking advantage of the
complete modularity of the design
(in two tier systems, the server is
typically centralized)
A middleware based system is a 3
tier architecture. This is a bit
oversimplified but conceptually
correct since the underlying systems
can be treated as black boxes. In
fact, 3 tier only makes sense in the
context of middleware systems
(otherwise the client has the same
problems as in a 2 tier system).
Middleware
Middleware is just a level of indirection between clients and other layers of the system.
It introduces an additional layer of business logic encompassing all underlying systems.
By doing this, a middleware system:
simplifies the design of the clients by reducing the number of interfaces,
provides transparent access to the underlying systems,
acts as the platform for inter-system functionality and high level
application logic, and
takes care of locating resources,
accessing them, and gathering results.
But a middleware system is just a system like any other! It can also be 1 tier, 2 tier, 3 tier ...
[Figure labels: Clients, Integration logic, Application logic, Resource managers, Server A, Server B]
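The role of middleware as a level of indirection can be sketched as follows: the client issues one call, and the middleware locates the underlying systems, accesses them, and gathers the results. The two servers, their methods, and the quote operation are hypothetical examples.

```python
class InventoryServer:
    """Hypothetical Server A."""

    def stock(self, item):
        return {"widget": 7}.get(item, 0)


class PricingServer:
    """Hypothetical Server B."""

    def price(self, item):
        return {"widget": 2.5}[item]


class Middleware:
    """Single interface for clients; integration logic lives here, not in the client."""

    def __init__(self):
        # Locating resources is the job of the middleware, not of the client.
        self._services = {
            "inventory": InventoryServer(),
            "pricing": PricingServer(),
        }

    def quote(self, item, quantity):
        # Inter-system functionality: combines answers from both servers.
        available = self._services["inventory"].stock(item)
        if quantity > available:
            return {"item": item, "available": available, "total": None}
        unit_price = self._services["pricing"].price(item)
        return {"item": item, "available": available, "total": unit_price * quantity}


middleware = Middleware()
print(middleware.quote("widget", 4))
print(middleware.quote("widget", 10))
```

The client never learns that two servers exist, which is the transparency mentioned above; in a 2 tier design the same combination logic would have to live in every client.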
Technical aspects of middleware
The introduction of a middleware layer helps in that:
the number of necessary interfaces is greatly reduced:
• clients see only one system (the middleware)
• local applications see only one system (the middleware)
it centralizes control and provides a common integration platform
it makes necessary functionality widely available to all clients
it makes it possible to implement functionality that otherwise would be very difficult to
provide (e.g., transactions)
it is a first step towards dealing with some forms of application
heterogeneity and integration.
The middleware layer does not help in that:
it is another indirection level,
it is complex software,
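The claim that middleware greatly reduces the number of necessary interfaces can be made concrete with a small count. The model is a simplifying assumption of this sketch: every client type must reach every server, and each distinct pairing needs its own interface.

```python
def interfaces_without_middleware(client_types, servers):
    """Assumed worst case: each client type integrates with each server directly."""
    return client_types * servers


def interfaces_with_middleware(client_types, servers):
    """Each client type and each server only needs an interface to the middleware."""
    return client_types + servers


# Example figures chosen for illustration only.
print(interfaces_without_middleware(10, 5))
print(interfaces_with_middleware(10, 5))
```

Under this model the count grows with the product of the two numbers in the direct case and with their sum when a middleware layer sits in between.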
A three tier middleware based system
[Figure labels: External clients, internal clients, connecting logic, control, user logic, middleware, wrappers, Resource managers, Resource manager, 2 tier systems, 2 tier system]
Three tier as pattern
The introduction of a middle tier is a common mechanism to solve design problems of all kinds
Advantages
Middle tier makes it possible to implement additional functionality without touching the server
Can act as point of integration
Hides servers from clients
Hides clients from servers
Disadvantages
Performance
Additional logic in the system (often difficult to trace if not well designed)
N-tier: connecting to the Web
N-tier architectures result from
connecting several three tier systems to each other and/or by adding an additional layer to allow clients to access the system through a Web server
The Web layer was initially external to the system (a true additional layer); today, it is slowly being incorporated into a presentation layer that resides on the server side (part of the
middleware infrastructure in a three tier system, or part of the server directly in a two tier system)
The addition of the Web layer led to the notion of “application servers”, which was used to refer to
middleware platforms supporting access through the Web
[Figure: client (Web browser), presentation layer (Web server), application logic layer (middleware), resource management layer]
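The idea of a Web layer placed in front of an existing middle tier can be sketched with Python's standard WSGI conventions. The function middleware_lookup and its data are hypothetical stand-ins for the middle tier; the sketch calls the Web layer directly instead of starting a real Web server.

```python
from wsgiref.util import setup_testing_defaults


def middleware_lookup(name):
    """Hypothetical middle tier call that the Web layer delegates to."""
    return {"zurich": "ETH Zurich"}.get(name, "unknown")


def web_layer(environ, start_response):
    """Presentation layer on the server side: turns results into HTML for a browser."""
    name = environ.get("PATH_INFO", "/").strip("/")
    body = f"<html><body>{middleware_lookup(name)}</body></html>".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# Simulate one browser request without opening a network socket.
environ = {}
setup_testing_defaults(environ)
environ["PATH_INFO"] = "/zurich"
statuses = []
response = web_layer(environ, lambda status, headers: statuses.append(status))
print(statuses[0], response[0].decode("utf-8"))
```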
Each tier adds functionality
[Figure labels: INTERNET, FIREWALL, LAN, Web server cluster, LAN, gateways, internal clients, middleware, application logic, resource management layer, database, additional resource, Wrappers and gateways, file application]