To achieve the goals of rendering as a service, mapping between user clients $C$ and computing resources $R$ should be efficient and flexible. Let $P$ be a set partition of $R$. A binding $b(c) \mapsto x$ is established between a client $c \in C$ and a block $x \in P$ when at least one computing resource $r \in R$ is assigned to $c$. The mapping $M$ is the union of all bindings. For example, if two clients $c_1$ and $c_2$ connect to an ideal service with computing resources $R = \{r_1, \ldots, r_{12}\}$, and are assigned half the total resources each, then $x_1 = \{r_1, \ldots, r_6\}$, $x_2 = \{r_7, \ldots, r_{12}\}$, $P = \{x_1, x_2\}$ and $M = \{b(c_1) \mapsto x_1, b(c_2) \mapsto x_2\}$. Client demands for computing resources fluctuate and RaaS should be elastic enough to respond by transitioning from a mapping $M$ to another $M'$ in real-time. Thus, if a
work, use point-based sampling. Monte Carlo techniques, based on ray tracing, are employed for their applicability to arbitrary geometries and reflection behaviours [30]. Another advantage of Monte Carlo techniques is stability; they have the property that the error bound is always $O(n^{-1/2})$ regardless of dimensionality. Point-based sampling methods lend themselves more to parallelisation than finite element methods.
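The $O(n^{-1/2})$ behaviour can be checked empirically. The following is a minimal sketch, not part of the described system: it estimates an integral over the unit hypercube with uniform samples and measures the root-mean-square error over repeated trials. The integrand `mean_coordinate` and the helper names are hypothetical choices made for illustration; the only assumption is that the error is read in the expected (statistical) sense.

```python
import math
import random


def mc_estimate(f, dim, n, rng):
    """Monte Carlo estimate of the integral of f over the unit hypercube [0, 1]^dim."""
    total = 0.0
    for _ in range(n):
        point = [rng.random() for _ in range(dim)]
        total += f(point)
    return total / n


def rms_error(f, exact, dim, n, trials=200, seed=0):
    """Root-mean-square error of the estimator over independent trials."""
    rng = random.Random(seed)
    squared = 0.0
    for _ in range(trials):
        squared += (mc_estimate(f, dim, n, rng) - exact) ** 2
    return math.sqrt(squared / trials)


def mean_coordinate(point):
    """Hypothetical test integrand; its integral over [0, 1]^dim is 0.5 for any dim."""
    return sum(point) / len(point)


if __name__ == "__main__":
    for dim in (1, 5):
        coarse = rms_error(mean_coordinate, 0.5, dim, n=100, seed=1)
        fine = rms_error(mean_coordinate, 0.5, dim, n=1600, seed=2)
        # 16 times more samples should shrink the error by roughly sqrt(16) = 4,
        # in one dimension as well as in five.
        print(dim, coarse / fine)
```

Because the ratio is estimated from a finite number of trials, it fluctuates around 4 rather than matching it exactly.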
3. Related Work
Chalmers et al. [6] presented an overview of parallel rendering algorithms for high-fidelity graphics. While ray tracing is relatively straightforward to parallelise, parallelisation can become a challenge when considering more complex lighting scenarios or when attempting to achieve interactive rates. Keates and Hubbold [16] presented a parallel ray tracer that achieved 1 to 5 frames per second on a 64 processor machine by limiting themselves to ray casting. Muus [21] also presented an interactive ray tracer using 96 processors for combinatorial solid modelling. Parker et al. [25] presented an interactive ray tracer that achieved interactive frame rates using frameless rendering on a dedicated 64 core shared memory multiprocessor. Wald et al. [34] presented a parallel ray tracer running on a distributed system that achieved interactive rates via parallelism and careful use of instruction-level parallelism via SIMD CPU instructions. This work was extended to a full global illumination solution termed IGI [32]. IGI adapted Instant Radiosity [17] to be used by a ray tracing renderer. The method used interleaved sampling to avoid having to compute the shadow rays for all virtual point light sources at each intersection point. The system was parallelised, achieving close to linear speed-up on a system consisting of up to 48 processors. Guenther extended this system to achieve parallel photon mapping [13]. Zhou et al. [38] presented a scheduling technique for REYES rendering using distributed GPUs.
Within the distributed context, a number of publications use grid computing, in particular desktop grids, for rendering high-fidelity graphics. Aggarwal et al. [1] presented a rendering system for computing high-fidelity animations using computational grids. They used an irradiance caching scheme for the implementation, and their system ran a two-stage rendering, computing the irradiance cache [36] in the first stage and using the results from the merged irradiance cache in a second, subsequent stage. In a follow-up paper, Aggarwal et al. [2] presented rendering on desktop grids. In this work, single images were computed within time constraints set by the user. The rendering was independent of the fluctuations in performance of the desktop grid. This work was finally extended to interactive high-fidelity rendering [3]. This system was able to handle potential imbalances in load, which may occur through resources suddenly becoming available or disappearing, by distributing tasks based on quasi-random sampling, and maintained a consistent frame rate via both temporal and spatial reconstruction on the server machine. This work is similar to ours in that the number of resources may change dynamically; however, rendering as a service is able to control and distribute the changes, whereas when rendering over a desktop grid the available resources are unknown. The authors of [24] propose a remote rendering service based on a single desktop setup; the focus is on efficient compression and streaming of remotely rendered frames.
[Figure 2 diagram labels: connect, input/output, allocate, input/output, release, disconnect]
Figure 2: Rendering as a Service
4. Rendering as a Service
RaaS addresses the needs of users who require high-fidelity visualisation but lack the resources to produce it (Fig. 2). There is a large unfulfilled potential for such a system; one could envisage an architect armed with just a tablet device, walking on site and receiving real-time high-fidelity visual feedback based on position and orientation. While a number of cloud services do exist, some of which are also used for rendering (e.g. [26]), they are not interactive and do not share the same ambitions as RaaS. We were unable to leverage such systems for RaaS and so had to architect a novel system from scratch that could provide interactivity while adhering to cloud-level properties such as consistency of service. The viability of RaaS hinges on its ability to provide, for a given set of resources, various levels of graphical fidelity, and response times within user expectations. These two properties are dependent; increasing graphical fidelity will invariably increase response time, and lowering response time sacrifices graphical fidelity. To some degree, parallelism can be exploited to keep response times low while increasing graphical fidelity, but this is largely dependent on the scalability of the applied rendering algorithms. Unlike dedicated rendering systems, RaaS has to scale well in the number of users and share computing resources amongst them. Scalability and elasticity are two important characteristics of RaaS, where scalability is the ability of the system to accommodate growth and elasticity is the system's ability to re-provision a pool of resources in real-time, based on the demands set by the various users.
4.1. High-level architecture
To achieve the goals of rendering as a service, mapping between user clients $C$ and computing resources $R$
Figure 5.1: Overview of Rendering as a Service. Clients connect and have resources allocated to them for the duration of their rendering job, after which the resources are freed again. All computation occurs remotely, with the server streaming the output.
5. Rendering as a Service (RaaS) 89
third client $c_3$ connects to the system and has equal priority to $c_1$ and $c_2$, the system should be able to quickly transition to a partition $P' = \{x_1', x_2', x_3'\}$, where $x_1' = \{r_1, \ldots, r_4\}$, $x_2' = \{r_5, \ldots, r_8\}$ and $x_3' = \{r_9, \ldots, r_{12}\}$, such that the new mapping is $M' = \{b(c_1) \mapsto x_1', b(c_2) \mapsto x_2', b(c_3) \mapsto x_3'\}$. Such mappings may be performed quickly and efficiently by a central authority privy to both $C$ and $R$ (see Figure 5.2). Client connectivity, and resource management and binding, are considered two separate concerns; we propose two entities within the central authority to handle them: the Service Manager and the Resource Manager, respectively.
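The transition from $M$ to $M'$ in the example can be sketched in a few lines. This is a minimal illustration assuming equal-priority clients and contiguous, equally sized blocks; the function name `equal_partition` and the dictionary representation of the mapping are hypothetical and are not taken from the system described here.

```python
def equal_partition(resources, clients):
    """Return a mapping from each client to its block of resources.

    Resources are split into contiguous blocks of equal size, one per client.
    Any remainder is simply left unassigned in this sketch.
    """
    if not clients:
        return {}
    block_size = len(resources) // len(clients)
    return {
        client: resources[i * block_size:(i + 1) * block_size]
        for i, client in enumerate(clients)
    }


# R = {r1, ..., r12}
resources = [f"r{i}" for i in range(1, 13)]

# Two clients: M = {b(c1) -> x1, b(c2) -> x2}, six resources each.
M = equal_partition(resources, ["c1", "c2"])

# A third client of equal priority connects: M' assigns four resources each.
M_prime = equal_partition(resources, ["c1", "c2", "c3"])
```

Recomputing the whole mapping is the simplest possible policy; a real transition would also have to account for the cost of moving busy resources between jobs.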
A job constitutes the chosen smallest unit of processing that can be managed independently within the framework. It is a collection of computing resources $x_m$ working together towards satisfying the request of a single client $c_i$, and is described by the binding $b(c_i) \mapsto x_m$. Every job has an associated Job Controller (see Figure 5.2), an active entity that relays client requests to bound resources and sends back responses, allowing clients to interact with resources. The Job Controller is assigned upon client connection. Initially, a connected client has no resources allotted, as a job has yet to be specified. As soon as a client submits a job specification, the system can perform resource allocation (see §5.3). The number of computing resources $|x_m|$ is not directly set by client $c_i$. Instead, specially designated programs (either automated or user-driven) are used to determine and set resource bindings for each and every client in the system, via a façade exposed by the Admin Controller (see Figure 5.2). Administrative and control services are also provided for system management to take place (e.g. load balancing), as well as monitoring tools (e.g. user job monitoring); the service level and functionality exposed is largely determined by the privilege level of the connected user account. Depending on priority and availability, a number of resources may be allocated to the client, allowing for job execution to start. The resources assigned to a Job Controller are all but one given the role of worker; the exception is the first assigned resource, which is given the role of Task Coordinator (see §5.4). This arrangement is tailored towards facilitating the integration of parallel rendering algorithms which favour the Master-Worker paradigm (see §4.2.2). It also imposes a hard upper bound $c_{max}$ on the number of clients, for $n_{tot}$ processing elements, such that $c_{max} = \lfloor (n_{tot} - n_{fe}) / 2 \rfloor$, where $1 \le n_{fe} \le n_{tot}$ is the number of processing elements allocated to the front end. At least a single processing element is required to run the front end service, while each client necessitates a minimum of two processing elements, one running the Task Coordinator and
[Figure 3 diagram. Component labels: Client, Frontend, Backend, Service Manager, Resource Manager, Resource 1 ... Resource n, Job Controller 1 ... Job Controller n, Task Coordinator 1 ... Task Coordinator n, Workers, Admin Controller, Client 1 ... Client n, Admin Console. Edge labels: connect, client interaction, manual system management, create instance, add/remove resources, manage resource assignments. Set labels: $C$, $x_f$, $x_1$, $x_n$, $R$.]
Figure 3: Rendering as a service (RaaS) high-level architecture diagram
should be flexible and efficient. Let $P$ be a set partition of $R$. A binding $b(c) \mapsto x$ is established between a client $c \in C$ and a block $x \in P$ when at least one computing resource $r \in R$ is assigned to $c$. The mapping $M$ is the union of all bindings. For example, if two clients $c_1$ and $c_2$ connect to a system with computing resources $R = \{r_1, \ldots, r_{12}\}$, and are assigned half the total resources each, then $x_1 = \{r_1, \ldots, r_6\}$, $x_2 = \{r_7, \ldots, r_{12}\}$, $P = \{x_1, x_2\}$ and $M = \{b(c_1) \mapsto x_1, b(c_2) \mapsto x_2\}$. Client demands for computing resources fluctuate and RaaS should be elastic enough to respond by transitioning from a mapping $M$ to another $M'$ in real-time. Thus, if a third client $c_3$ connects to the system and has equal priority to $c_1$ and $c_2$, the system should be able to quickly transition to a partition $P' = \{x_1', x_2', x_3'\}$, where $x_1' = \{r_1, \ldots, r_4\}$, $x_2' = \{r_5, \ldots, r_8\}$ and $x_3' = \{r_9, \ldots, r_{12}\}$, such that the new mapping is $M' = \{b(c_1) \mapsto x_1', b(c_2) \mapsto x_2', b(c_3) \mapsto x_3'\}$. Such mappings may be performed quickly and efficiently by a central authority (Fig. 3, centre) privy to both $C$ and $R$ (Fig. 3). Client connectivity, and resource management and binding are considered two separate concerns; we propose two entities within the central authority to handle them: the Service Manager and Resource Manager respectively.
A job constitutes the smallest unit of processing that can be managed independently within our system. It is a collection of computing resources $x_m$ working together towards satisfying the request of a single client $c_n$ and is described by the binding $b(c_n) \mapsto x_m$. Every job has an associated Job Controller (Fig. 3), an active entity that relays client requests to bound resources and sends back responses, allowing clients to interact with resources. This is assigned upon client connection. Initially, a connected client has no resources allotted, as a job has yet to be specified. As soon as a client submits a job specification, the system can perform resource allocation (see §4.2). Computing resources $x_m$ are not directly set by client $c_n$. Instead, specially designated programs (either automated or user-driven) are used to determine and set resource bindings for each and every client in the system, via a façade exposed by the Admin Controller (Fig. 3). Other administrative and control services are also provided for system management to take place (e.g. load balancing). Depending on priority and availability, a number of resources may be allocated to the client, allowing for job execution to start. The resources assigned to a Job Controller are all but one given the role of worker, except for the single case when it is given the role of Task Coordinator (see §4.3). This arrangement is tailored towards facilitating the integration of parallel rendering algorithms which favour the Master-Worker paradigm.
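The role assignment within a job can be illustrated with a small sketch. It assumes that the first resource bound to a Job Controller takes the Task Coordinator role and every later one becomes a worker; the class layout, field names and `bind` method are hypothetical and only mirror the Master-Worker arrangement described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class JobController:
    """Hypothetical stand-in for a Job Controller bound to one client."""
    client: str
    coordinator: Optional[str] = None
    workers: List[str] = field(default_factory=list)

    def bind(self, resource: str) -> str:
        """Bind a resource to this job and return the role it was given."""
        if self.coordinator is None:
            # Assumption: the first bound resource coordinates the job.
            self.coordinator = resource
            return "task_coordinator"
        self.workers.append(resource)
        return "worker"

    def can_execute(self) -> bool:
        """A job needs a coordinator and at least one worker to do useful work."""
        return self.coordinator is not None and len(self.workers) >= 1


job = JobController(client="c1")
roles = [job.bind(r) for r in ("r1", "r2", "r3")]
```

A freshly connected client corresponds to a `JobController` with no coordinator and no workers, matching the statement that no resources are allotted until a job has been specified.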
4.2. Resource management
Computing resources managed by the Resource Manager each consist of a single processing element (PE) and some measure of private primary memory. A resource may be characterised by two distinct states, idle or busy. An idle resource can do no useful work unless it is bound to a Job Controller. It resides in an unbound block $x_f$ in $P$, termed the free pool. When a resource is bound (Fig. 4(b)), it moves out of the free pool and performs initialisation. For each job, the Task Coordinator is the only resource to receive initialisation instructions from the Job Controller; it must then pass this information on to other workers, to coordinate their initialisation. Moreover, it is responsible for disseminating state information (e.g. user
Figure 5.2: High-level architecture diagram for Rendering as a Service that includes the major components of the system. Every connecting client is assigned a Job Controller for mediating communication with the Task Coordinator and Workers, which carry out the actual rendering computation. Note that $x_f$ is equivalent to the free pool.
another a single worker.
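The bound on the number of clients, $c_{max} = \lfloor (n_{tot} - n_{fe}) / 2 \rfloor$, follows from reserving $n_{fe}$ processing elements for the front end and two per client. A minimal sketch of the arithmetic, with a hypothetical function name and a default of one front-end processing element chosen for illustration:

```python
def max_clients(n_tot: int, n_fe: int = 1) -> int:
    """Upper bound on concurrent clients for n_tot processing elements.

    n_fe processing elements are reserved for the front end; each client
    needs at least two (one Task Coordinator and one worker).
    """
    if not 1 <= n_fe <= n_tot:
        raise ValueError("n_fe must satisfy 1 <= n_fe <= n_tot")
    return (n_tot - n_fe) // 2


# With 12 processing elements and one reserved for the front end,
# 11 remain, which is enough for 5 clients with one element left over.
print(max_clients(12, 1))
```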