263-3855-00: Cloud Computing Architecture
Section 2
Cloud Computing Services, IaaS, SaaS, and FaaS
Swiss Federal Institute of Technology Zurich
Eidgenössische Technische Hochschule Zürich
Last Edit Date: 02/20/2024
Disclaimer and Term of Use:
We do not guarantee the accuracy or completeness of this summary. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file.
This personal note is adapted from Professor Ana Klimovic. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
The cloud in an economic context¶
The cloud has changed the economics of computing. Software and hardware used to be manufactured products, bought and operated by the client. The cloud turns them into services: made available by a provider, paid per use, with ownership never shifting to the client.
Significant implications for the IT industry:
The demand for computing infrastructure shifts to cloud providers
Requires huge capital investments to become a player (hyperscalers)
Less emphasis on standards and compatibility (specialization)
Hyperscalers: Amazon, Microsoft, Google, Alibaba, Baidu, Tencent, Meta, plus a bunch of smaller cloud providers.
The hyperscalers are very big: they dominate the server market and build their own hardware.
This has been a major shift in the IT market that is having a huge impact on how technology evolves.
Cloud economics for the user: CapEx vs. OpEx¶
Capital Expenses are investment. They become an asset of the company. Cost spread over time through depreciation.
Operational Expenses are cost. They are the price of doing business. Typically tax deductible.
The cloud turns what used to be capital expenses into operational expenses, changing the equation of the IT infrastructure.
The cloud has enabled many companies to scale to sizes that would have been difficult because of the upfront investment required.
Why is using the cloud cost-efficient?¶
For each individual user, resources are often underutilized (as low as 15%). But resources must be allocated and powered on in case they are needed on short notice, so the total cost of ownership (TCO) is very high relative to actual usage.
The cloud improves efficiency by letting users share resources: utilization increases from the cloud provider's perspective, and users pay only for the resources they use (conditions may apply).
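The utilization argument can be made concrete with a back-of-the-envelope calculation. The numbers below (a 10,000-unit 3-year TCO, 15% vs. 60% utilization) are illustrative assumptions, not figures from the course:

```python
HOURS_PER_YEAR = 24 * 365

def cost_per_useful_hour(total_cost, years, utilization):
    """Total cost of ownership divided by the hours of useful work done."""
    total_hours = years * HOURS_PER_YEAR
    return total_cost / (total_hours * utilization)

# A server owned for 3 years but only 15% utilized, vs. the same server
# shared (multiplexed across users) at 60% utilization:
on_prem = cost_per_useful_hour(total_cost=10_000, years=3, utilization=0.15)
shared = cost_per_useful_hour(total_cost=10_000, years=3, utilization=0.60)
print(f"on-prem: {on_prem:.2f} per useful hour")
print(f"shared:  {shared:.2f} per useful hour")
```

At 15% utilization, each useful hour costs four times what it would at 60%: this gap is exactly the margin a provider can exploit while still charging the user less.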
The cloud as a service¶
Layer | Traditional On-Premises IT | Colocation | Hosting | IaaS | PaaS | SaaS
---|---|---|---|---|---|---
Data | self | self | self | self | self | self
Application | self | self | self | self | self | provider
Databases | self | self | self | self | provider | provider
Operating System | self | self | self | self | provider | provider
Virtualization | self | self | self | provider | provider | provider
Physical Servers | self | self | provider | provider | provider | provider
Network & Storage | self | self | provider | provider | provider | provider
Data Center | self | provider | provider | provider | provider | provider

"provider" = supplied by the provider, "self" = managed by the client (shown as black vs. red text in the original figure; the per-column split follows the standard layer diagram).
From on-premise to the cloud¶
Simply put, the difference between on-premises and cloud software is location. On-premises software is installed and runs on a company's own hardware infrastructure and is hosted locally, whereas cloud software is stored and managed on the provider's servers and accessed through a web browser or another interface.
Infrastructure as a Service (IaaS)¶
Rent a virtual machine (a securely isolated partition of a server) on demand.
Equivalent to renting a server, enabled through virtualization:
Can rent a VM (actual computing node is shared)
Can also rent a dedicated server (virtualized but not shared)
Can also rent a "bare metal" server (neither virtualized nor shared)
Useful to those who need access to servers but can, and want to, build the rest of the stack themselves.
IaaS Examples¶
Amazon Elastic Compute Cloud (EC2)
- Many different instance types
- Several pricing models, such as preemptible / spot VMs
Additional services available:
- Persistent storage volumes
- Autoscaling
- Monitoring and alerts
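The autoscaling service mentioned above can be sketched as a target-tracking policy, similar in spirit to EC2 Auto Scaling's target tracking: size the fleet so that average utilization moves toward a target. The function name, the 50% target, and the instance limits are illustrative assumptions:

```python
import math

def autoscale(current_instances, cpu_utilization, target=0.5,
              min_instances=1, max_instances=10):
    """Target-tracking sketch: pick the fleet size that would bring
    average CPU utilization back to the target, clamped to limits."""
    desired = math.ceil(current_instances * cpu_utilization / target)
    return max(min_instances, min(max_instances, desired))

print(autoscale(4, 0.9))  # overloaded fleet -> scale out
print(autoscale(4, 0.2))  # underutilized fleet -> scale in
```

The user only declares the target and limits; deciding when to add or remove instances is the provider's job.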
Platform as a Service (PaaS)¶
Besides hardware, you also need system software to run applications. Many cloud applications need a common set of features such as autoscalers, load balancers, distributed caches. In PaaS, what is "rented" is not just a VM but the whole software and hardware infrastructure needed to run an application. The user still puts the application together using the services provided but PaaS reduces the amount of code and takes advantage of cloud features.
PaaS Example¶
Deploying and autoscaling web apps.
Software as a Service (SaaS)¶
With IaaS, you get the computers, you build the rest. With PaaS, you get the computers and middleware, you build the application. With SaaS, you get the application, you provide the data to populate it.
SaaS can be offered by companies other than the cloud providers, using the cloud infrastructure (or their own cloud).
SaaS Examples¶
Microsoft Office 365, Google Mail, Dropbox, etc. Also for enterprise such as Snowflake (data analytics), Salesforce, etc.
Snowflake:
Data analytics platform
Data in AWS S3
Virtual warehouses in AWS EC2
Some computation in Lambdas
Heavy use of cloud services to implement the rest of the system
Users see a data analytics engine, not a cloud platform.
Functions as a Service (FaaS)¶
IaaS and PaaS still require you to think about the overall architecture. SaaS only lets you do what the specific software product enables. FaaS allows users to encapsulate a small application as a function (e.g., a container image), and the cloud platform takes care of scheduling, allocating resources for, and executing the function.
No reservation of VMs or provisioning of instances
Automatic scalability
Automatic triggering
Charged by the millisecond
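A FaaS function is just a handler the platform invokes when a trigger fires. The sketch below follows the Python AWS Lambda handler convention (`handler(event, context)`); the event shape with `Records` and `object_key` is a hypothetical storage-bucket trigger, not a real Lambda event schema:

```python
def handler(event, context=None):
    """Invoked by the platform when a trigger fires, e.g. an object
    landing in a storage bucket; no server is provisioned or managed."""
    records = event.get("Records", [])
    processed = [r["object_key"].upper() for r in records]  # stand-in "work"
    return {"status": 200, "processed": processed}

# Locally we can only simulate the trigger event the platform would deliver:
print(handler({"Records": [{"object_key": "photos/cat.jpg"}]}))
```

Note that nothing here reserves a VM or keeps a process alive: the code exists only for the duration of one invocation.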
How does FaaS work?¶
Serverless computing¶
Infrastructure as a service is like renting a car
Rent on-demand, insurance included
Still need to do a lot of work: plan the route, make sure you have enough gas, concentrate on the road, monitor traffic
Pay even when not driving
Serverless computing is like taking a train
A high-level abstraction to the cloud
Users specify what needs to be done, not how
Train machine learning model A on dataset B
Run function X when data becomes available in storage bucket Y
New cloud computing paradigm that raises the level of abstraction for users and offers high elasticity & fine-grained billing.
Users write application code and specify event triggers. Cloud providers take care of resource allocation, scaling, and scheduling. Pay per millisecond of resource usage, pay zero when no load.
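The "pay per millisecond, pay zero at no load" billing model is easy to express as a formula: cost is proportional to invocations × duration × memory. The default rates below mirror published AWS Lambda list prices (per GB-second plus per request), but treat them as illustrative:

```python
def faas_cost(invocations, avg_duration_ms, memory_gb,
              gb_second_rate=0.0000166667, per_request=0.0000002):
    """Pay-per-use billing sketch: cost scales with execution time and
    memory consumed, and is exactly zero when there is no load."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * memory_gb
    return gb_seconds * gb_second_rate + invocations * per_request

print(f"1M invocations: ${faas_cost(1_000_000, 100, 0.128):.2f}")
print(f"no load:        ${faas_cost(0, 100, 0.128):.2f}")
```

Contrast this with a reserved VM, which accrues cost around the clock whether or not any request arrives.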
FaaS application examples¶
Today, FaaS works well for small tasks that are highly parallelizable and triggered by events (e.g., data becoming available in storage bucket).
FaaS as a supercomputer on demand?¶
Can use FaaS to spin up thousands of tiny threads in parallel to accelerate various tasks on-demand:
Encode / decode videos
Compile large software code-bases
Run unit tests
Data analytics
But many challenges arise.
Limitations of FaaS today¶
Today's commercial FaaS impose many restrictions:
Functions are stateless (no persistent state across invocations)
Functions cannot establish direct network connections between each other (IP addresses not exposed)
Limited execution time per function invocation (e.g., 15 min in AWS Lambda)
Limited compute / memory / network bandwidth per function, scaled in fixed ratios
Challenge: data sharing in serverless analytics¶
Direct communication between serverless tasks is difficult. Tasks are short-lived and stateless.
Solution is to pass data through a shared remote data store.
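The pattern can be sketched in a few lines: two stages of stateless functions never address each other directly; instead, the first stage writes partial results under well-known keys and the second stage reads them back. A plain dict stands in for the remote object store, and the key naming scheme is an illustrative assumption:

```python
store = {}  # stand-in for a remote object store such as S3

def mapper(task_id, data):
    """First-stage function: writes its partial result to the store."""
    store[f"partial/{task_id}"] = sum(data)

def reducer():
    """Second-stage function: reads all partial results back."""
    return sum(v for k, v in store.items() if k.startswith("partial/"))

mapper(0, [1, 2, 3])
mapper(1, [4, 5])
print(reducer())  # the two stages never communicated directly
```

The indirection costs an extra network round trip per exchange, which is one reason data sharing is a key bottleneck in serverless analytics.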
Cloud provider perspective¶
Serverless gives cloud providers the opportunity to optimize resource efficiency (i.e., cost) under the hood, since they control resource allocation for tasks and have a global view across users.
However, today's FaaS platforms are highly inefficient. Lots of room for improvement in the system software design.
FaaS challenges for providers¶
Scheduling fine-grain, short-running functions from different users is hard
Need to densely bin-pack functions per server to get high resource utilization
But also need to securely isolate functions in sandboxes (e.g., VMs)
Function sandboxes need to boot fast and be scheduled fast, since the functions themselves may only run for a few milliseconds
Warm vs. cold start¶
To optimize performance, providers often keep function sandboxes "warm" in memory.
Problem: sandbox still consumes memory even when not executing a request.
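The trade-off can be simulated with a small warm-pool model: keeping more sandboxes warm (here with LRU eviction) avoids cold starts but pins more DRAM. The trace and capacities are made-up inputs for illustration:

```python
from collections import OrderedDict

def count_cold_starts(requests, warm_capacity):
    """Replay a request trace against a warm pool of bounded size,
    evicting the least recently used sandbox when the pool is full."""
    warm = OrderedDict()
    cold_starts = 0
    for fn in requests:
        if fn in warm:
            warm.move_to_end(fn)          # warm start: reuse the sandbox
        else:
            cold_starts += 1              # cold start: boot a new sandbox
            warm[fn] = True
            if len(warm) > warm_capacity:
                warm.popitem(last=False)  # evict, freeing its memory
    return cold_starts

trace = ["A", "B", "A", "C", "A", "B"]
print(count_cold_starts(trace, warm_capacity=1))
print(count_cold_starts(trace, warm_capacity=3))
```

With one warm slot every request in this trace cold-starts; with three, repeat invocations hit warm sandboxes, at the price of three resident sandboxes' worth of memory.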
High memory overhead of hot sandboxes¶
The cloud platform needs to commit DRAM for each hot sandbox, but memory is expensive.
FaaS paradox¶
Serverless gives providers the opportunity to be more efficient by optimizing scheduling decisions.
But current FaaS is highly inefficient. Why?
- Simply retrofitting the traditional software stack originally designed for coarse-grained, long-running applications leads to high overhead
- The provider has little info about application characteristics to optimize scheduling
Designing cloud applications¶
Tiers¶
Cloud software systems are typically divided into different tiers. Tiers can be conceptual or real.
Basic Tiers¶
Presentation Layer (client, external API layer): Enables interaction with the outside world (end-user, other apps).
Application Logic (business rules, business processes): Defines the system functionality.
Data Layer (databases, storage, business objects): This is where the data being used and processed is stored and managed.
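The three tiers above can be sketched as separate components with a strict calling order (presentation → logic → data); in a real deployment each could run on a different machine. All class and method names are illustrative:

```python
class DataLayer:
    """Stores and manages the data."""
    def __init__(self):
        self._rows = {}
    def put(self, key, value):
        self._rows[key] = value
    def get(self, key):
        return self._rows.get(key)

class ApplicationLogic:
    """Business rules; the only component that touches the data layer."""
    def __init__(self, data):
        self.data = data
    def register_user(self, name):
        if self.data.get(name) is not None:
            raise ValueError("user exists")  # a business rule
        self.data.put(name, {"name": name})
        return self.data.get(name)

class PresentationLayer:
    """Talks to the outside world; knows nothing about storage."""
    def __init__(self, logic):
        self.logic = logic
    def handle_request(self, name):
        user = self.logic.register_user(name)
        return f"Welcome, {user['name']}!"

app = PresentationLayer(ApplicationLogic(DataLayer()))
print(app.handle_request("ada"))
```

Because each tier only knows the one below it, any tier can be replaced (a different UI, a different database) without touching the others.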
Trend towards more tiers
1-tier architecture: fully centralized¶
The beginning of computing: the mainframe architecture. The presentation layer, application logic, and data are built as a monolithic entity. Users / programs access the system through display terminals, but what is displayed and how it appears is controlled by the server ("dumb" terminals). As computers became more powerful, we could move the presentation layer to the client.
Advantages:
Clients are independent of each other: can have several presentation layers depending on what each client wants.
Can use computing power at the client machine to have more sophisticated presentation layers without requiring extra resources at the server machine.
2-tier architecture¶
Introduce the notion of a service: the client invokes a service implemented by the server. With it comes the notion of a service interface: how the client can invoke a given service. The interfaces to all the services provided by a server form the server's application programming interface (API). Many standardization efforts have tried to agree on common APIs for each type of server.
2-tier architectures as a system design pattern¶
Advantages
Leverage resources at the client
Customization by modifying the client
Forces the application to have an interface
Integration is easier through the client
Centralized control at the server (as in 1-tier)
Disadvantages
Software at the (remote) clients is part of the system
Maintenance requires coordinating server and client
Backward compatibility issues for older clients
Stateless vs. stateful clients
Managing connections at server becomes bottleneck
Performance loss through client-server context switches and networking
Scalability limitations of client-server¶
Architectural limitations of client-server¶
If clients want to access 2 or more servers, a 2-tier architecture causes several problems:
Servers don't know about each other; the client is the point of integration, which leads to increasingly "fat" clients.
There is no common business logic between the servers.
This is very inefficient from all points of view (software design, portability, code reuse, and performance, since the client's capacity is limited).
Example of a client system: web browser¶
A standardized platform for clients: web browsers (HTTP, HTML, and everything that came afterwards)
Provided the universal client platform
Decoupled the client from the server, as the client side is delivered on demand
Reduced the overhead of managing clients (managing only browsers)
Standardized the presentation-layer language and the communication protocol with the presentation layer (needed for the internet)
Initially for reading content and simple forms; today a full computing platform (JavaScript)
3-tier architecture: middleware¶
In a 3-tier system, the three layers are fully separated. The layers are also typically distributed, taking advantage of the complete modularity of the design (in 2-tier systems, the server is typically centralized). Once distributed, there is the possibility of combining components from many different systems. Middleware is used to connect the components.
Middleware¶
Middleware is a level of indirection between clients and other layers of the system.
An additional layer of business logic encompassing all underlying systems.
Simplifies the design of the clients by reducing the number of interfaces.
Acts as the platform for inter-system functionality and high-level application logic.
Takes care of locating resources, accessing them, and gathering results.
A middleware system is just a system like any other. It can be 1 tier, 2 tier, 3 tier, etc.
Advantages:
Reduce the number of necessary interfaces
- Clients and local applications see only one system (the middleware)
Centralize control and provide a common integration platform
Make necessary functionality widely available to all clients
- Can implement functionality that would otherwise be difficult (e.g., transactions)
Help deal with application heterogeneity and integration
Disadvantages:
Yet another level of indirection, which adds extra complexity and extra latency
Needs to be standardized
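Middleware's role as a level of indirection can be sketched as a dispatcher: clients call one interface, and the middleware locates the right backend and forwards the request. The route scheme and backend stubs are illustrative stand-ins for heterogeneous underlying systems:

```python
class Middleware:
    """Single entry point that routes client requests to backends."""
    def __init__(self):
        self._routes = {}
    def register(self, prefix, backend):
        self._routes[prefix] = backend
    def call(self, request):
        prefix, _, rest = request.partition("/")
        backend = self._routes.get(prefix)
        if backend is None:
            return "error: no such service"
        return backend(rest)  # locate the resource, access it, return result

mw = Middleware()
mw.register("inventory", lambda item: f"inventory has {item}")
mw.register("billing", lambda cust: f"invoice for {cust}")

# The client sees one interface regardless of how many backends exist:
print(mw.call("inventory/widgets"))
print(mw.call("billing/acme"))
```

Adding a new backend means one `register` call; no client needs to change, which is exactly the interface-reduction advantage listed above.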
Micro-services: death star architectures¶
Foundational trends towards micro-services¶
Organizational trends
- Desire for teams to work independently, demand for quick development, globalization of companies
Hardware trends
- Death of Moore's law leads to need for parallelization
Basic idea: build apps composed of tiny pieces communicating over the network.
Example: social media micro-service¶
Components can scale independently and are implemented & managed by separate teams in the organization.
Example: search engine¶
Tail latency matters in cloud applications¶
Due to the high fan-out in multi-tier cloud applications, the latency of the slowest requests and services matters.
Analyzing average latency is not enough; a service level objective is often expressed as p99 latency < X.
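Why fan-out amplifies the tail: if each backend call independently misses its deadline with probability p, a request that fans out to N backends is slow whenever any one of them is, i.e. with probability 1 - (1 - p)^N. A quick computation (with made-up numbers) shows the effect:

```python
def prob_request_slow(p_backend_slow, fanout):
    """Probability that at least one of `fanout` independent backend
    calls is slow, which makes the whole request slow."""
    return 1 - (1 - p_backend_slow) ** fanout

# 1% slow backends feel rare -- until a request fans out to 100 of them:
print(f"{prob_request_slow(0.01, 1):.2%}")
print(f"{prob_request_slow(0.01, 100):.2%}")
```

With a fan-out of 100, a 1% per-backend tail makes roughly two thirds of end-user requests slow, which is why p99 (not the average) is the metric that matters.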
The choice the cloud made¶
Scale up:
Attaining higher capacity by making a single component larger
Worked well for decades: Moore's law and Dennard scaling
A single component cannot deal with today's loads
Other system considerations (reliability, replication, scalability)
Scale out:
Attaining larger capacity by using many instances of the same (smaller component)
Relies on distribution (racks, clusters, data centers) and the necessary underlying infrastructure
Provides more flexibility (easier to provision and adjust to workload)
Introduces all the problems of distributed computing (additional infrastructure, overheads)
Why scale out vs. just scale up?¶
There is no way to store all of the web on a single computer, so scale-up simply does not apply. Scale-out increases parallelism: parallel search allows going through huge amounts of data in a shorter time. It is also best effort: if some nodes do not reply, there is still some answer (acceptable in some applications).
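The scale-out search described above can be sketched as partitioned, best-effort lookup: data is sharded across "nodes", each node scans only its shard, and failed nodes simply contribute nothing. The sharding scheme and documents are illustrative:

```python
def partition(docs, n_nodes):
    """Shard the documents round-robin across n_nodes."""
    shards = [[] for _ in range(n_nodes)]
    for i, doc in enumerate(docs):
        shards[i % n_nodes].append(doc)
    return shards

def search(shards, term, failed=()):
    """Each node scans only its own shard (in parallel, conceptually);
    failed nodes are skipped, yielding a partial, best-effort answer."""
    hits = []
    for node_id, shard in enumerate(shards):
        if node_id in failed:
            continue
        hits += [d for d in shard if term in d]
    return hits

docs = ["cloud intro", "cloud economics", "tail latency", "cloud faas"]
shards = partition(docs, n_nodes=2)
print(search(shards, "cloud"))
print(search(shards, "cloud", failed={1}))
```

Losing node 1 loses its hits but still returns an answer, which is acceptable for search and similar applications but not for, say, banking.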
Separation of compute and storage¶
Front end complexity¶
Storage layer: multiple data sources¶
Common optimizations: caching, load-balancing, etc.¶
Why such designs?¶
These designs are the result of a combination of
Increasingly complex systems (systems instead of programs)
The need to meet Service Level Agreements (performance, reliability, constraints)
Growing use of open-source systems for many tasks
Proliferation of specialized solutions (specialized databases)
The underlying hardware architecture
Cloud services also make such architectures easier.