CHAPTER 15 Patterns for Cloud Computing Architecture

There are no rules of architecture for a castle in the clouds.

G K Chesterton

15.1 Introduction

Cloud computing is a new paradigm that improves the utilization of resources and decreases the power consumption of hardware. Cloud computing allows users to have access to resources, software and information using any device that has access to the Internet. The users consume these resources and pay only for the resources they use.

A cloud model provides three types of services: infrastructure-as-a-service (IaaS), platform-as-a-service (PaaS), and software-as-a-service (SaaS). IaaS provides processing, storage, networks and other fundamental computing resources on which the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. PaaS offers platform layer resources, including operating system support and software development frameworks to build, deploy and deliver applications into the cloud. SaaS provides end-user applications that are running on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface such as a web browser (for example web-based e-mail).

We present here patterns for the three service levels of clouds:

Infrastructure-as-a-service [Has12a] describes the infrastructure to allow the sharing of distributed virtualized computational resources, such as servers, storage and networks.
Platform-as-a-service [Has12a] provides virtual environments for developing, deploying, monitoring and managing applications online without the cost of buying and managing the corresponding software tools or hardware.
Software-as-a-service [Has12d] provides a set of software applications available in a cloud system that can be accessed by client devices through the Internet.

These are not, strictly speaking, security patterns, although they include security aspects in their definition. Our view is that to understand the security issues of clouds we need to look at their complete architecture. A fair amount of work has been done on specific aspects of the security of clouds, but we know of no effort to define a holistic view of their security. Given a conceptual architecture, we can apply the methodology presented in Chapter 3 and illustrated with several examples in Chapter 16. We have not done this here, leaving it as an exercise for the reader; our point is that it should be fairly easy to apply this (or another) methodology to protect the cloud system.

Our methodology requires a systematic enumeration of threats, which we have already done [Has12c]. The next step is to consider each use case and analyze the threats to its activities [Has12e]. We have also defined misuse patterns that describe how an attack (misuse of information) is performed from the point of view of the attacker ([Has12b], Chapter 14). This approach represents a variation of the methodology of Chapter 3 in that, instead of building a specific application, for example a financial application, we are building a distributed platform on which applications may execute. The security level of this platform contributes to the security level of the application, so an evaluation of the security of the platform must be combined with a security evaluation of the application.

USING THE PATTERNS FOR SECURING CLOUDS

Chapter 14 showed some misuse patterns. By defining precisely the units of a cloud architecture, we can observe the progression of an attack through them and define ways to stop its advance. We have started building a catalog of cloud misuse patterns; with a complete catalog we can apply them systematically and use a reference architecture [Has13] to find where we should add corresponding security patterns to stop them. This work also implies developing some new security patterns for this purpose.

The reference architecture should also support the standards that apply to each service level. There are still no accepted standards for clouds, but NIST is leading some work in this direction [Hog11]. Some of the cloud services are XML web services and they should follow their security standards (Chapter 11 and Chapter 12). The increasing use of representational state transfer (REST) services, for which there is no security standard, implies that they will be handled in a mostly ad hoc fashion [Rod08]. The reference architecture and its security patterns should be valuable in providing designers with a systematic approach to handling both XML and REST-based types of web service.

Another use of the architecture is to provide a reference for security certification of services. Knowing the misuse patterns that affect a particular service, a provider can show that their service can handle the corresponding threats by incorporating appropriate security patterns, which would increase customer trust in their use.

Finally, patterns provide a way to evaluate the security of complete systems, by finding a matching security pattern to defend against each threat [Fer10a]. We can apply them to evaluate the security of cloud systems.
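The evaluation idea above, matching each threat with a security pattern that defends against it, can be sketched as a simple coverage check. This is only an illustrative sketch: the `SecurityPattern` class, the `uncovered_threats` function and the threat and pattern names used with them are hypothetical and are not taken from the catalogs cited in this chapter.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class SecurityPattern:
    """A security pattern and the (hypothetical) threats it stops."""
    name: str
    stops: FrozenSet[str]


def uncovered_threats(threats: Iterable[str],
                      patterns: Iterable[SecurityPattern]) -> List[str]:
    """Return the threats that no pattern in the design defends against."""
    covered = set()
    for pattern in patterns:
        covered |= pattern.stops
    # Preserve the order of the threat enumeration in the report.
    return [threat for threat in threats if threat not in covered]
```

An empty result would indicate that every enumerated threat has at least one matching pattern; it says nothing about how well each pattern is implemented.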

15.2 Infrastructure-as-a-Service

The INFRASTRUCTURE-AS-A-SERVICE pattern describes the infrastructure to allow the sharing of distributed virtualized computational resources, such as servers, storage and networks.

CONTEXT Distributed systems in which we want to improve the utilization of resources and provide convenient access to all users.

PROBLEM Some organizations do not have the resources to invest in the infrastructure, middleware or applications that are needed to run their businesses. Also, they may not be able to handle increases in demand, or cannot afford to maintain and store unused resources. How can we provide these users with quality access to computational resources?

The solution to this problem must resolve the following forces:

Transparency. The underlying architecture should be transparent to its users. Users should be able to use the provider’s services without understanding its infrastructure.
Flexibility. Different infrastructure configurations and resource volumes can be demanded by users.
Elasticity. Users should be able to expand or reduce resources in order to meet the different needs of their applications.
Pay-per-use. Users should only pay for the resources they consume.
On-demand-service. Services should be provided on demand.
Manageability. In order to manage a large volume of service requests, cloud resources must be easy to deploy and manage.
Accessibility. Users should be able to access resources from anywhere at any time.
Testability. We intend to develop system programs in this environment and we need to test them conveniently.
Shared resources. Many users should be able to share resources in order to increase the volume of resource utilization and thus reduce costs.
Isolation. Different user execution instances should be isolated from each other.
Shared non-functional requirements provision (NFRs). Sharing of the costs of providing NFRs is necessary to allow providers to offer a higher level of NFRs.
Security. The IaaS level is the basis for execution of the complete cloud system and its degree of security will affect all the applications running on it. We should provide a convenient and measurable structure to define security requirements.

SOLUTION The solution to this problem is a structure that is composed of many servers, storage and a network, which can be shared by multiple users and is accessible through the Internet. These resources are provided to the users in the form of infrastructure-as-a-service (IaaS). IaaS is based on virtualization technology, which creates unified resources that can be shared by different applications. This foundation layer – IaaS – can be used as a reference for non-functional requirements.

STRUCTURE Figure 15.1 shows a class diagram for the cloud-based INFRASTRUCTURE-AS-A-SERVICE pattern. The CloudController is the main component, which processes requests from a Party. A Party can be an institution or a user (customers and administrators). A Party can have one or more Accounts. The CloudController coordinates a collection of services such as virtual machine (VM) scheduling, authentication, VM monitoring and management. When a CloudController receives a request from the Party to create a VM, it requests its corresponding ClusterControllers to provide a list of their free resources. With this information, the CloudController can choose which cluster will host the requested VM.

Figure 15.1: Class diagram for a cloud-based INFRASTRUCTURE-AS-A-SERVICE pattern

A ClusterController is composed of a collection of NodeControllers, which consist of a pool of Servers that host VM instances. The ClusterController handles the state information of its NodeControllers, and schedules incoming requests to run instances.

A NodeController controls the execution, monitoring, and termination of the VMs through a virtual machine monitor (VMM), which is responsible for running VM instances. The CloudController retrieves and stores user data and VMImages. The VMImageRepository contains a collection of VMImages that are used to instantiate a VM. The DHCP server assigns a MAC/IP (media access control/internet protocol) pair address for each VM through the CloudController, and requests the DNS server to translate domain names into IP addresses in order to locate cloud resources.
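The controller hierarchy described above can be sketched in a few lines of code. This is a minimal sketch under simplifying assumptions, not the structure of any particular product: resources are reduced to a CPU count, the DHCP server, the VMImageRepository and the VMM are omitted, and all attribute and method names (`free_cpus`, `free_resources`, `instantiate`, `create_vm`) are hypothetical. The CloudController picks the first cluster that reports enough free resources.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class NodeController:
    name: str
    free_cpus: int
    vms: List[str] = field(default_factory=list)

    def start_vm(self, vm_id: str, cpus: int) -> None:
        # Stands in for the request that a real node forwards to its VMM.
        self.free_cpus -= cpus
        self.vms.append(vm_id)


@dataclass
class ClusterController:
    name: str
    nodes: List[NodeController]

    def free_resources(self) -> Dict[str, int]:
        """Report the free CPUs of every node in this cluster."""
        return {node.name: node.free_cpus for node in self.nodes}

    def instantiate(self, vm_id: str, cpus: int) -> Optional[NodeController]:
        """Place the VM on the first node with enough free CPUs."""
        for node in self.nodes:
            if node.free_cpus >= cpus:
                node.start_vm(vm_id, cpus)
                return node
        return None


@dataclass
class CloudController:
    clusters: List[ClusterController]
    accounts: Set[str]

    def create_vm(self, account: str, vm_id: str,
                  cpus: int) -> Optional[Tuple[str, str]]:
        """Return (cluster name, node name) hosting the VM, or None."""
        if account not in self.accounts:
            raise PermissionError("requester has no valid account")
        for cluster in self.clusters:
            free = cluster.free_resources()
            if any(available >= cpus for available in free.values()):
                node = cluster.instantiate(vm_id, cpus)
                return (cluster.name, node.name)
        return None
```

Returning `None` when no cluster has capacity is a design choice of this sketch; a real controller might queue the request or report an error instead.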

DYNAMICS Use cases include [Nis]:

Open/close an account (actor: user)
Copy data objects into/out of a cloud (actor: administrator)
Erase data objects in a cloud (actor: administrator)
Store/remove virtual machine images (actors: administrator, user)
Create a virtual machine (actor: user)
Migrate a virtual machine (actors: administrator, user)

We show two use cases below, ‘Create a virtual machine’ and ‘Migrate a virtual machine’.

Use Case: Create a Virtual Machine – Figure 15.2

Figure 15.2: Sequence diagram for the use case ‘Create a virtual machine’

Summary: Create a virtual machine for a party, assign to it the required resources and assign it to a server.
Actor: Party.
Precondition: The Party has a valid account.
Description:
1 A Party requests a VM with some computational resources from the CloudController.
2 The CloudController verifies whether the requester has a valid account.
3 The CloudController requests the available resources from the ClusterController closest to the location of the Party. In turn, the ClusterController queries its NodeControllers about their available resources. (In the sequence diagram, there is only one ClusterController and one NodeController to keep the diagram simple, but there can be more cluster and node controllers.)
4 The NodeController sends the list of its available resources to the ClusterController, and the ClusterController sends it back to the CloudController.
5 The CloudController chooses the first ClusterController that can support the computational resource requirements.
6 The CloudController requests a MAC/IP pair address from the DHCP server for the new VM.
7 The CloudController retrieves a VM image from the VMRepository.
8 The CloudController sends a request to the ClusterController to instantiate a VM.
9 The ClusterController forwards the request to the NodeController, which forwards it to the VMM (virtual machine monitor).
10 The VMM creates a VM with the requested resources.
11 The VMM assigns the VM to one of the servers.
Postcondition: A virtual machine is created and assigned to an account and a server.

Use Case: Migrate a Virtual Machine – Figure 15.3

Figure 15.3: Sequence diagram for the use case ‘Migrate a virtual machine’

The administrator can migrate a VM to a specific node controller that can be located in the same or in a different cluster controller. The administrator can also migrate a VM to a specific location, or to the first node that has the available resources. For the scenario below, we assume that the administrator will move a VM to the first available node controller within the same cluster controller. However, the migration process can be automatic, for example due to load balancing.
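The scenario assumed above, moving a VM to the first node controller in the same cluster that has enough free resources, can be sketched as follows. This is a simplified stop-and-copy sketch, not a description of how any real VMM migrates machines: the `VM` and `Node` classes, the CPU-only resource model and the `migrate` function are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VM:
    vm_id: str
    cpus: int
    content: bytes = b""
    running: bool = True


@dataclass
class Node:
    name: str
    free_cpus: int
    vms: Dict[str, VM] = field(default_factory=dict)


def migrate(nodes: List[Node], vm_id: str) -> str:
    """Move vm_id to the first other node with enough free CPUs.

    Returns the name of the destination node.
    """
    source = next((n for n in nodes if vm_id in n.vms), None)
    if source is None:
        raise KeyError(f"VM {vm_id!r} is not hosted in this cluster")
    vm = source.vms[vm_id]
    target = next(
        (n for n in nodes if n is not source and n.free_cpus >= vm.cpus),
        None,
    )
    if target is None:
        raise RuntimeError("no node controller has enough free resources")

    vm.running = False            # stop the VM on the source node
    snapshot = vm.content         # copy its content
    # Create a new VM on the destination and load the copied content.
    target.vms[vm_id] = VM(vm_id, vm.cpus, snapshot, running=True)
    target.free_cpus -= vm.cpus
    # Release the resources held on the source node.
    del source.vms[vm_id]
    source.free_cpus += vm.cpus
    return target.name
```

Checking for a destination before stopping the VM keeps the source untouched when the migration cannot proceed.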

Summary: A virtual machine is migrated from one node controller to another.
Actor: Administrator.
Precondition: A VM resides on some NodeController.
Description:
1 The Administrator requests the CloudController to migrate a VM.
2 The CloudController sends a request to the ClusterController to start the migration of the VM.
3 The ClusterController requests the NodeControllerSource to stop the VM. The NodeControllerSource forwards this request to the VMMSource.
4 The VMMSource stops the VM and copies the content of the VM.
5 All the steps of the use case Create a Virtual Machine – Figure 15.2 are carried out.
6 The VMMSource sends the content of the VM to the VMMDestination.
7 The VMMDestination copies the content into the new VM.
Postcondition: The virtual machine has migrated to another host.

IMPLEMENTATION As an example, we show the implementation of one of the known uses of this pattern. There are many ways to implement our conceptual models; this is just one possible way to do it. Eucalyptus [Euc] is open source software that allows IaaS to be implemented in order to run and control virtual machine instances via Xen and KVM. Eucalyptus consists of five main components that are described in Figure 15.4 [Bau09].

Figure 15.4: Eucalyptus’ main components

The two higher-level components are the Cloud Controller and Walrus. The Cloud Controller is a Java program that offers EC2-compatible [Ama] SOAP and web interfaces. Walrus is a data storage system where users can store and access virtual machine images and their data. Walrus can be accessed through S3-compatible SOAP and REST interfaces. Top-level components can aggregate resources from several clusters.

Each cluster needs a Cluster Controller, which is typically deployed on the head node of a cluster. Each node will also need a Node Controller for controlling the VMM. Cluster Controllers and Node Controllers are deployed as web services, and communication between them takes place over SOAP with WS-Security [Has09c].

A cloud can be set up as a single cluster in which the Cloud Controller and the Cluster Controller are located on the same machine, which is referred to as the ‘front-end’. All other machines running the Node Controllers are referred to as the ‘back-end’. However, a more advanced configuration is possible, comprising several Cluster Controllers or Walrus instances deployed on different machines.

A typical configuration around 2012 includes [Ubu]:

One cloud controller (CPU 1GHz, memory 512MB, disk 5400rpm IDE, disk space 40GB)
One Walrus controller (CPU 1GHz, memory 512MB, disk 5400rpm IDE, disk space 40GB)
One cluster controller plus storage controller (CPU 1GHz, memory 512MB, disk 5400rpm IDE, disk space 40GB)
Nodes (virtualization technology extensions, memory 1GB, disk 5400rpm IDE, disk space 40GB)

CONSEQUENCES The INFRASTRUCTURE-AS-A-SERVICE pattern offers the following benefits:

Transparency. Cloud users are usually not aware of where their virtual machines are running or where their data is stored. However, in some cases users can request a general location zone for virtual machines or data.
Flexibility. Cloud users can request different types of computational and storage resources. For instance, Amazon’s EC2 [Ama] provides a variety of instance types and operating systems.
Elasticity. Resources provided to users can be scaled up or down depending on their needs. Multiple virtual machines can be initiated and stopped to handle increased or decreased workloads.
Pay-per-use. Cloud users can save on hardware investment because they do not need to purchase more servers; they just need to pay for the services that they use. Cloud services are usually charged using a fee-for-service billing model [Cen10]. For example, users might pay for the storage, bandwidth or computing resources they consume per month.
On-demand-services. IaaS providers deliver computational resources, storage and network as services at users’ request.
Manageability. Users place their requests with the cloud administrator, who allocates, migrates and monitors VMs.
Accessibility. Cloud services are delivered using user-centric interfaces via the Internet [Wan08] from anywhere and at any time.
Testability. Having an environment isolated in a virtual machine allows the testing of system programs without affecting the execution of other virtual machines.
Shared resources. Virtualization enables a pool of resources such as processing capacity, storage and networks to be shared, so that a higher utilization rate can be achieved [Amr].
Isolation. A VMM provides strong isolation between different virtual machines, whose guest operating systems are then protected from one another [Kar08].
Shared non-functional requirements (NFRs) provision. Some IaaS providers offer security features such as authentication and authorization to customers, which can be added as part of the service. Sharing allows the provider to offer a higher degree of NFRs at a reasonable cost.
Security. Security defenses can be defined with respect to the architecture. For example, connection of users to the cloud controller may be mutually authenticated to avoid imposters from either side.

The pattern also has the following potential liabilities:

Cloud computing is dependent on network connections. While using cloud services, users must be connected to the Internet, although a limited amount of work can be done offline.
The cloud may bring security risks associated with privacy and confidentiality, since users do not have control of the underlying infrastructure.
The isolation between VMs may not be strong [Has12a].
Virtualization introduces some performance overhead.

KNOWN USES
Eucalyptus [Euc] is an open source framework used for hybrid and private cloud computing.
OpenNebula [Ope2] is an open source toolkit for building clouds.
Nimbus [Nimb] is an open source set of tools that offers IaaS capabilities to the scientific community.
Amazon’s EC2 [Ama] provides computing capacity through web services.
HP Cloud Services [Hp] is a public cloud solution that provides scalable virtual servers on demand.
IBM SmartCloud Foundation [IBMa] offers servers, storage and virtualization components for building private, public and hybrid clouds.

SEE ALSO
The VIRTUAL MACHINE OPERATING SYSTEM ARCHITECTURE pattern ([Fer05c] and page 179) describes the VMM and its created VMs from the point of view of an operating system architecture.
The Grid architectural pattern [Cam06] allows the sharing of distributed and heterogeneous computational resources such as CPU, memory and disk storage for a grid environment.
Misuse patterns in [Has12a] describe possible attacks on cloud infrastructures.
The PLATFORM-AS-A-SERVICE (PaaS) pattern (page 423) describes development platforms that provide virtual environments for developing applications in the cloud.
The Party pattern [Fow97] indicates that users can be individuals or institutions.
Several of the patterns shown earlier in this book can be used to protect different aspects of the cloud system.

15.3 Platform-as-a-Service

The PLATFORM-AS-A-SERVICE pattern describes how to provide virtual environments for developing, deploying, monitoring and managing applications online without the cost of buying and managing the corresponding software tools or hardware.

CONTEXT PaaS services are built on top of the cloud’s infrastructure-as-a-service (IaaS) features, which provide the underlying infrastructure.

PROBLEM Organizations may want to develop their own custom applications without buying and maintaining the development tools, databases, operating systems and infrastructure underlying them. Also, when a team is spread across several locations, it is necessary to have a convenient way to coordinate their work. How can we provide secure PaaS functions?

The solution to this problem must resolve the following forces:

Collaboration. Sometimes teams of developers are located in different geographic locations. When working on a project, they all should have access to the development tools, code and data.
Coordination. When many developers work on a complex project, they need to coordinate their work.
Elasticity. There should be a way to increase or decrease resources for more compute-intensive development and deployment tasks.
Pay-per-use. Parties should only pay for the resources that they use.
Transparency. Developers should not have to be concerned about the underlying infrastructure, including hardware and operating systems, and its configuration for development and deployment.
On-demand services. Developers should be able to request an application tool and start using it.
Accessibility. Developers should be able to access tools via standard networks, from anywhere at any time.
Testability. We intend to develop application programs in this environment and we need to test them conveniently.
Versatility. It should be possible to use the platform to build applications for any domain or type of application. Different options for development tools should be offered to the users.
Simplification. Developers should be able to build applications without installing any tool or specialized software on their computers.
Security. The platform should offer facilities for developing secure programs, and should itself be protected from attacks.

SOLUTION PaaS offers virtual execution environments with shared tools and libraries for application development and deployment into the cloud. PaaS uses IaaS as a foundation layer (servers, storage and network), and hides the complexity of managing the infrastructure underneath.

STRUCTURE Figure 15.5 shows a class diagram for a cloud-based platform-as-a-service. The PaaSProvider processes requests from Parties. A Party can be an institution or a user (developers, administrators). The Party will choose the development tools from the SoftwareRepository, which contains a list of available tools. The PaaSProvider offers VirtualEnvironments such as DevelopmentEnvironment and DeploymentEnvironment. The DevelopmentEnvironment is composed of DevelopmentTools, Libraries, Databases. The VirtualEnvironments are built on the IaaS features, which provide the underlying hardware. The same PaaSProvider can manage the IaaS, or it can be managed by a third-party service provider.

Figure 15.5: Class diagram for the PLATFORM-AS-A-SERVICE pattern
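The relationships in this structure can be sketched in code. The following is a minimal sketch under stated assumptions rather than the design of any real platform: the class names follow the diagram's vocabulary, but the attributes, the `request_environment` method and the tool names used with it are hypothetical, and the underlying IaaS layer is left out.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class VirtualEnvironment:
    owner: str
    kind: str                      # "development" or "deployment"
    tools: List[str] = field(default_factory=list)


@dataclass
class PaaSProvider:
    accounts: Set[str]             # Parties with a valid account
    software_repository: Set[str]  # names of the available tools
    environments: List[VirtualEnvironment] = field(default_factory=list)

    def request_environment(self, party: str, kind: str,
                            tools: List[str]) -> VirtualEnvironment:
        """Create a virtual environment holding the tools the Party chose."""
        if party not in self.accounts:
            raise PermissionError("party has no valid account")
        missing = [t for t in tools if t not in self.software_repository]
        if missing:
            raise ValueError(f"tools not in repository: {missing}")
        env = VirtualEnvironment(owner=party, kind=kind, tools=list(tools))
        self.environments.append(env)
        return env
```

In a fuller model, creating the environment would also involve a request to the IaaS layer for the virtual machines that host it.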

DYNAMICS Use cases include the following [Dod10]:

Open/close an account
Request a virtual environment
Use a virtual environment
Install development software
Deploy an application
Undo deploying an application
Consume development software – Figure 15.6, below

Figure 15.6: Sequence diagram for the use case ‘Consume development software’

Use Case: Consume Development Software – Figure 15.6

Summary: A party requests the use of a development application for the first time.
Actor: Party.
Precondition: The Party has an account.
Description:
1 The Party requests the use of specific development software.
2 The PaaSProvider checks whether the Party has a valid account.
3 The Party downloads the client applications onto its machine.
Postcondition: The client application is downloaded on the party’s machine.

Use Case: Deploy an Application – Figure 15.7

Figure 15.7: Sequence diagram for the use case ‘Deploy an application’

Summary: A party requests deployment of an application into the cloud, so that the application can be accessed by end-users from anywhere at any time.
Actor: Party (developer).
Precondition: The Party has an account.
Description:
1 A Party asks to deploy their application into the cloud.
2 The PaaSProvider checks whether the Party has a valid account.
3 The PaaSProvider calculates the computational resources needed for the deployment, such as the number of virtual machines.
4 The PaaSProvider asks the IaaS to create a set of virtual machines (VMs).
5 The PaaSProvider installs and runs the code.
Postcondition: The application is running and ready to be accessed by the end users.

IMPLEMENTATION As an example of the implementation of a typical PaaS approach, we describe the approach used by Force.com [Sal]. Force.com is a cloud platform-as-a-service system from Salesforce.com. Force.com’s platform provides PaaS services as a stack of technologies and services covering infrastructure, database as a service, integration as a service, logic as a service, user interface as a service, development as a service, and AppExchange [Sal2], to enable the creation of business applications.

Figure 15.8 shows the stack of Force.com’s technologies and services, which includes:

Figure 15.8: The Force.com stack and services (from [Sal2])

Infrastructure. The foundation of the Force.com platform is the infrastructure that supports the other layers. Force.com uses three geographically dispersed data centers and a production-class development laboratory which use replication to mirror the data at each location.
Database as a service. Customers can create customized data objects, such as relational tables, and use metadata to describe those objects. Force.com provides data security by offering features such as user authentication, administrative permissions, object-level permissions and field-level permissions.
Integration as a service. Force.com provides integration technologies that are compliant with open web services and service-oriented architecture (SOA) standards, including SOAP, WSDL and WS-I Basic Profile [Fer10b]. Force.com offers different prepackaged integration solutions, such as Web Services API, Web Services Apex, callouts and mashups, and outbound messaging.
Logic as a service. Force.com provides three options for implementing an application’s business processing logic: declarative logic (unique fields, audit history tracking, history tracking and approval processes), formula-based logic (formula fields, data validation rules, workflow rules and approval processes), and procedural logic (Apex triggers and classes).
User interface as a service. Force.com provides two types of tools for creating the user interface of applications built on the platform: Force.com’s Builder and Visualforce. Builder creates metadata, which Force.com uses to generate a default user interface for each database object, with its corresponding methods such as create, edit and delete. With Visualforce, developers can use standard web development technologies such as HTML, Ajax and Adobe Flex to create user interfaces for their cloud applications.
Development as a service. Force.com offers some features to create cloud applications: Metadata API, Integrated Development Environment (IDE), Force.com Sandbox and Code Share. Metadata API allows modification of the XML files that control an organization’s metadata. The IDE provides a code editor for adding, modifying and testing Apex applications. Apex is the Force.com proprietary programming language. Multiple developers can share a code source repository using the synchronization features of the IDE. Force.com Sandbox provides a separate cloud-based application environment for development, quality assurance and training. Force.com Code Share allows developers from different organizations to collaborate on the development, testing and deployment of cloud applications. Force.com IDE [Sal3] is a client application for creating, modifying, and deploying applications. Once the user downloads the IDE to their local machine, they can start coding. The IDE is in communication with the Force.com platform servers. There are two types of operations: online and offline. For example, in the online mode, when a class is saved, the IDE sends the class to the Force.com servers, which compile the class and return any result (error message). In the offline mode, all changes are performed on a local machine, and once connected to Force.com again, the changes are submitted and committed. Force.com provides built-in support for automated testing. Once an application is developed in the development environment, it may be migrated to another environment, such as testing or production.
Application exchange. AppExchange is a cloud application marketplace where users can find applications that are delivered by partners or third-party developers.

Force.com offers environments [Sal4] where users can start developing, testing and deploying cloud computing applications. There are different types of environments, such as production, development and test environments. The production environment stores live data, while the development environment stores test data and is used for developing and testing applications. The development environment has two types: Developer Edition and Sandbox. Sandbox is a copy of the production environment that can include data, configurations, or both. A Developer Edition environment [Sal5] includes the following developer technologies: Apex programming language, Visualforce for building custom user interfaces and controllers, the Integration APIs, and more. Figure 15.9 shows the platform for the Developer Edition environment. Force.com’s virtual environments run on Salesforce’s infrastructure.

Figure 15.9: Class diagram of Force.com’s PaaS architecture

Force.com uses various security techniques to defend its platform from different types of threats [Sal6]:

User authentication: most users are authenticated on the login page, but there are also other forms of user authentication, such as delegated authentication and Security Assertion Markup Language (SAML). An authenticated session needs to be established before accessing the Force.com SOAP API and Metadata API.
Force.com secures its network using various mechanisms, such as stateful packet inspection (SPI), bastion hosts, two-factor authentication processes and end-to-end TLS/SSL cryptographic protocols.
For sensitive data such as customer passwords, Force.com applies an MD5 one-way cryptographic hash function, and supports encryption of field data.
At an infrastructure and network level, Salesforce.com applies rigorous security standards, such as SysTrust SAS 70 Type II.
Salesforce.com implements industry best practices to harden the host computers. For example, all hosts use Linux or Solaris distributions with non-default configurations and minimal processes, user accounts and network protocols.

CONSEQUENCES The PLATFORM-AS-A-SERVICE pattern offers the following benefits:

Collaboration. Geographically dispersed developers can collaborate on the same project because the code is managed online [Law08].
Coordination. A project can be conveniently administered from a central point.
Elasticity. The resources (storage, networking resources and servers) needed to develop and deploy an application can grow or shrink to accommodate varying workload volumes. Scaling application deployments horizontally by replicating application components such as application servers and data stores is also possible.
Pay-per-use. Users only pay for the services they consume, and do not need to buy any development tools or full year licenses.
Transparency. The PaaS provider manages upgrades, patches and other maintenance, as well as the infrastructure. Users do not need to worry about compatibility issues between the server configurations and the development software.
On-demand services. PaaS providers offer software development tools that can be used by developers when needed.
Accessibility. PaaS services are accessed through the Internet via web browsers from anywhere at any time.
Testability. The variety of tools available makes testing application programs more convenient in this environment.
Versatility. PaaS offers various programming languages and databases. For example, with Microsoft Azure you can build applications using .NET, Java, PHP and others.
Simplification. Developers do not need to buy or install any development tools, or to keep the servers updated. The development tools are managed and maintained by the PaaS providers.
Security. The development tools offered should include tools to develop, test and deploy secure applications, supporting some secure methodology [Uzu12c]. The platform itself should have protection against its identified threats.

The pattern also has the following potential liabilities:

PaaS providers usually offer their own proprietary development software, which makes it hard to migrate an application from one PaaS vendor to another. Also, APIs from different providers vary, which raises portability issues.
The availability of the PaaS products depends mostly on the Internet. Thus, the services are available only as long as there are network connections.
A PaaS provider can either own the underlying infrastructure or subcontract it from an IaaS provider. In either case, the security or availability of PaaS services may not be assured.
Unscheduled upgrades of cloud-based software can be disruptive.

KNOWN USES Google App Engine [Goo2] provides an environment for building and hosting web applications on Google’s infrastructure. Google App Engine supports two application environments: Java and Python.
Microsoft Azure [Micb] provides a platform to build, deploy and manage applications. It provides various programming languages, such as .NET, Java, PHP and others, to build applications.
Salesforce [Sal] offers a development platform for building custom applications (see the Implementation section above).
IBM SmartCloud Applications Services [Dod10] delivers a collaborative environment that supports the full lifecycle for software development, deployment and delivery.

SEE ALSO The INFRASTRUCTURE-AS-A-SERVICE pattern (page 413) describes the infrastructure to allow sharing of distributed virtualized computational resources.
Misuse Patterns in [Has12a] describe possible attacks on cloud environments, which may affect the security of PaaS.
The Cloud Computing: Platform as a Service (PaaS) pattern [Nex10] describes execution environments for PaaS applications.
The Party pattern [Fow97] indicates that users can be individuals or institutions.
Several of the patterns presented earlier can be used to protect the platform. The catalog of patterns in Chapter 17 can be used to provide guidance about what to include in a development tool.
15.4 Software-as-a-Service

The SOFTWARE-AS-A-SERVICE pattern describes how to provide a set of software applications available in a cloud system that can be accessed by client devices through the Internet.

EXAMPLE Bob has a small business that sells services. Currently, he has three salespeople who offer the services and meet potential customers. He is thinking of expanding his company and hiring more salespeople, which will make it more difficult to track sales and information about potential customers.

CONTEXT SaaS applications are hosted by a provider and accessed through the Internet via user interfaces or APIs.

PROBLEM Customers may need to use software products that do not require local installation and maintenance of the software. How can software be delivered over the network?

The solution to this problem must resolve the following forces:

Pay-per-use. Customers should be charged on a per-use basis, like utility services.
Transparency. Customers should not be concerned about maintenance or updates to the software.
On-demand services. Customers should have the ability to start using an application when they need to.
Accessibility. Customers should be able to access the software applications at any time and anywhere.
Flexibility. Customers should be able to configure the software application to their needs, such as currency or date formats.
Elasticity. Applications should be able to scale down or up depending on the customers’ needs [Ju10]. For example, it should be possible to increase or decrease the number of users using the application.
Simplification. Customers should not need to install any special software on their local machine.
Security. The software offered to users must be secure: it should be built following a secure methodology [Uzu12c].

SOLUTION SaaS applications are delivered as a service to users, typically through the Internet via web browsers or APIs. Cloud-based SaaS enables users to access applications on demand without installing any software on their local machines; both computation and storage are hosted in the cloud. SaaS can be developed and deployed using underlying platform-as-a-service (PaaS) or infrastructure-as-a-service (IaaS) offerings.

STRUCTURE Figure 15.10 shows the class diagram for software-as-a-service. A SaaSProvider processes requests from Parties. A Party can be either a User or a group of users (an Institution). A Party can have one or more Accounts. A SaaSProvider offers a set of SaaSApplications. The SaaSCatalog contains the list of SaaSApplications that are offered to the users. There can be a single AppInstance of the application that is shared by different users, or a single instance per user. The SaaSApplications reside on a Platform. SaaSApplications can be deployed using underlying platform-as-a-service or infrastructure-as-a-service offerings. The platform can be owned by the SaaS provider or rented from a third-party provider.

Figure 15.10: Class diagram for the SOFTWARE-AS-A-SERVICE pattern
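
As a rough illustration of the structure in Figure 15.10, the classes and their associations might be sketched as follows. This is only one possible reading of the diagram: the attribute names (account_id, valid, owned, members, parties) and the offer method are assumptions introduced for the example, and the catalog is simplified to a dictionary held by the provider.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Account:
    account_id: str          # hypothetical attribute
    valid: bool = True       # hypothetical attribute


@dataclass
class Party:
    """A User or an Institution (a group of users)."""
    name: str
    accounts: List[Account] = field(default_factory=list)


@dataclass
class User(Party):
    pass


@dataclass
class Institution(Party):
    members: List[User] = field(default_factory=list)


@dataclass
class Platform:
    """Where applications reside; owned, or rented from a third party."""
    name: str
    owned: bool = True


@dataclass
class SaaSApplication:
    name: str
    platform: Platform


@dataclass
class AppInstance:
    """Shared by several parties, or dedicated to a single one."""
    application: SaaSApplication
    parties: List[Party] = field(default_factory=list)


@dataclass
class SaaSProvider:
    # The catalog lists the applications offered to parties.
    catalog: Dict[str, SaaSApplication] = field(default_factory=dict)
    instances: List[AppInstance] = field(default_factory=list)

    def offer(self, app: SaaSApplication) -> None:
        self.catalog[app.name] = app


platform = Platform("rented-platform", owned=False)
provider = SaaSProvider()
provider.offer(SaaSApplication("crm", platform))
bob = User("Bob", accounts=[Account("acct-1")])
shared = AppInstance(provider.catalog["crm"], parties=[bob])
provider.instances.append(shared)
```

Modeling User and Institution as subclasses of Party follows the Party pattern cited in this section; whether an AppInstance holds one party or many distinguishes the per-user and shared deployments.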

DYNAMICS The set of use cases includes the following [Spe09]:

Open/close an account (actor: party)
Set up an application (actor: SaaS provider)
Meter usage (actor: SaaS provider)
Subscribe to an application (actor: party)
Consume an application (actor: party)

We show the last two of these use cases in detail below.

Use Case: Subscribe to an Application – Figure 15.11

Figure 15.11: Sequence diagram for the use case ‘Subscribe to an application’
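
The interaction in Figure 15.11 might be sketched in code as follows; the step numbers in the comments refer to the description of this use case. This is a minimal sketch, not a provider's actual API: the method name subscribe, the SubscriptionError exception, the in-memory sets and the extra check that the application is on offer are all assumptions made for illustration.

```python
class SubscriptionError(Exception):
    """Hypothetical error raised when a subscription request is refused."""


class SaaSProvider:
    def __init__(self, catalog):
        self.catalog = set(catalog)      # applications on offer
        self.valid_accounts = set()      # account ids known to be valid
        self.instances = {}              # (account_id, app_name) -> instance state

    def subscribe(self, account_id, app_name):
        # Step 2: check that the requester holds a valid account.
        if account_id not in self.valid_accounts:
            raise SubscriptionError("no valid account")
        # Extra check assumed here: the application must be in the catalog.
        if app_name not in self.catalog:
            raise SubscriptionError("unknown application")
        # Step 3: create an instance; one instance per party is assumed.
        self.instances[(account_id, app_name)] = {"data": {}}
        # Step 4: acknowledge completion to the user.
        return f"subscription to {app_name} complete"


provider = SaaSProvider(catalog=["crm"])
provider.valid_accounts.add("bob")
ack = provider.subscribe("bob", "crm")
```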

Summary A user asks to buy a subscription in order to access an application from the SaaS provider.
Actor User.
Precondition The User has a valid account.
Description
1 A User asks to subscribe to an application.
2 The SaaSProvider checks whether the User has a valid account.
3 The SaaSProvider creates an instance of the software application. In this case we assume that each application instance serves one party.
4 The SaaSProvider notifies the User that their subscription to the application is complete.
Postcondition The User can start using the application.

Use Case: Consume an Application – Figure 15.12

Figure 15.12: Sequence diagram for the use case ‘Consume an application’
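
The interaction in Figure 15.12 might be sketched in the same spirit; the step numbers in the comments refer to the description of this use case. The Session class, the consume and save methods and the dictionary standing in for cloud storage are hypothetical names chosen for the example, and the pre-populated account and subscription stand in for the precondition.

```python
class SaaSProvider:
    def __init__(self):
        # Precondition, assumed already established: Bob has an account
        # and is subscribed to the "crm" application.
        self.valid_accounts = {"bob"}
        self.subscriptions = {("bob", "crm")}
        self.storage = {}    # stands in for provider-side (cloud) storage

    def consume(self, account_id, app_name):
        # Step 2: check the account and, per the precondition, the subscription.
        if account_id not in self.valid_accounts:
            raise PermissionError("no valid account")
        if (account_id, app_name) not in self.subscriptions:
            raise PermissionError("not subscribed")
        # Step 3: hand back a session through which the user works.
        return Session(self, account_id, app_name)


class Session:
    def __init__(self, provider, account_id, app_name):
        self.provider = provider
        self.key = (account_id, app_name)

    def save(self, data):
        # Step 4: the user's data is stored on the provider side.
        self.provider.storage.setdefault(self.key, []).append(data)


provider = SaaSProvider()
session = provider.consume("bob", "crm")
session.save({"lead": "Alice", "status": "contacted"})
```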

Summary A user asks to use an application to which they are already subscribed.
Actor User.
Precondition The User has an account and has subscribed to the application.
Description
1 A User asks to consume an application to which they are subscribed.
2 The SaaSProvider checks whether the User has a valid account.
3 The User starts consuming the application.
4 The User asks for their data to be saved.
Postcondition The User’s data is stored in the cloud.

IMPLEMENTATION SaaS can be categorized into four distinct levels [Per10]. The first-level services are ad hoc/custom, in which each customer has their own customized version of the hosted application. For the second-level services, configurable, the SaaS provider’s servers host a separate instance of the application for each customer, as in the previous level. However, the instances are not customized for each customer, but provide some configuration options.

The third-level services are configurable, multi-tenant-efficient, in which a single instance of the software serves all customers, with configurable metadata. At the fourth level, scalable, configurable, multi-tenant-efficient, customers are served by multiple identical instances behind a load balancer.
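
To make the fourth level more concrete, the following sketch dispatches tenant requests across identical instances that all read the same configuration metadata. The round-robin policy is an assumption (the text does not prescribe a balancing strategy), as are the class names, the tenant names and the currency setting.

```python
import itertools


class AppInstance:
    """One of several identical, stateless application instances."""

    def __init__(self, name, tenant_config):
        self.name = name
        self.tenant_config = tenant_config   # configuration metadata shared by all instances

    def handle(self, tenant, request):
        currency = self.tenant_config[tenant]["currency"]
        return f"{self.name} served {request} for {tenant} in {currency}"


class RoundRobinBalancer:
    """Assumed policy: hand each request to the next instance in turn."""

    def __init__(self, instances):
        self._cycle = itertools.cycle(instances)

    def dispatch(self, tenant, request):
        return next(self._cycle).handle(tenant, request)


config = {"acme": {"currency": "USD"}, "globex": {"currency": "EUR"}}
instances = [AppInstance(f"instance-{i}", config) for i in range(2)]
balancer = RoundRobinBalancer(instances)
replies = [
    balancer.dispatch("acme", "report"),
    balancer.dispatch("globex", "report"),
    balancer.dispatch("acme", "invoice"),
]
```

Because every instance is identical and tenant differences live only in the metadata, any instance can serve any tenant, which is what allows instances to be added or removed as load changes.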

In order to manage multi-tenant data, there are three approaches for databases [Liu10]. The simplest approach is to store data in different databases for each customer. For the second approach, the same database hosts multiple customers’ data, where each customer has their own tables and schema. In the third approach, customers’ data is stored in the same database and set of tables.
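
The third approach, in which all customers share one database and one set of tables, can be illustrated with a small sketch using Python's built-in sqlite3 module. The table name leads, the tenant_id column and the sample rows are assumptions made for the example; the point is only that each row carries a tenant identifier and that every query is scoped by it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leads (tenant_id TEXT NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO leads (tenant_id, name) VALUES (?, ?)",
    [("acme", "Alice"), ("acme", "Arun"), ("globex", "Greta")],
)


def leads_for(tenant_id):
    # Every query is scoped by tenant_id; omitting this filter would
    # expose other tenants' rows.
    rows = conn.execute(
        "SELECT name FROM leads WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    )
    return [name for (name,) in rows]


acme_leads = leads_for("acme")
```

The first two approaches would instead select a separate database connection, or a separate schema and set of tables, per customer, trading stronger isolation for higher per-customer overhead.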

In Salesforce’s model, a single instance of an application is shared among many customers, and customers’ data is stored in a shared database. For customization purposes, metadata can be used to configure the way in which an application appears and behaves, such as the appearance of the screen and the data fields.
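
Metadata-driven customization of this kind might look like the following sketch, in which one shared piece of application code renders a record differently for each customer. It is not Salesforce's implementation: the metadata keys (title, fields, date_format), the tenant names and the functions are hypothetical.

```python
DEFAULT_LAYOUT = {
    "title": "Leads",
    "fields": ["name", "email"],
    "date_format": "%Y-%m-%d",
}

# Per-customer metadata; only the differences from the defaults are stored.
TENANT_METADATA = {
    "acme": {"title": "Acme Prospects", "fields": ["name", "phone"]},
    "globex": {"date_format": "%d/%m/%Y"},
}


def layout_for(tenant):
    # Tenant metadata overrides the defaults; the application code is shared.
    return {**DEFAULT_LAYOUT, **TENANT_METADATA.get(tenant, {})}


def render(tenant, record):
    layout = layout_for(tenant)
    shown = ", ".join(f"{f}={record.get(f, '')}" for f in layout["fields"])
    return f"[{layout['title']}] {shown}"


record = {"name": "Alice", "email": "a@example.com", "phone": "555-0100"}
acme_view = render("acme", record)
globex_view = render("globex", record)
```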

EXAMPLE RESOLVED Bob has two options:

Bob has to make an initial investment in infrastructure, such as the hardware, middleware and software needed for an application that can track sales, customers’ information, potential customers and reports. He also has to be in charge of the maintenance of the equipment and the software needed to run the application. The problem with this option is that if demand decreases, there are still operational costs for unused resources. The benefit of this option is that he has control over the underlying infrastructure as well as his data.

Bob can subscribe to a SaaS application hosted by a cloud provider. The cloud provider takes control of the software application. The SaaS provider may rent the infrastructure from a third-party provider, which makes the process of compliance more complicated. The location where the data is processed or stored may be uncertain, which may raise data privacy issues. Since SaaS solutions are web-hosted, they have to be accessed over an Internet connection, which can be insecure. Sensitive data has to be stored online on the provider’s servers. If the provider goes into bankruptcy, lock-in can be an issue.

CONSEQUENCES The SOFTWARE-AS-A-SERVICE pattern offers the following benefits:

Pay-per-use. SaaS providers often charge for their applications based on parameters such as the number of users. For example, Salesforce’s Enterprise CRM costs $125 per user per month. Google Apps for business costs $5 per user per month, but it is free for individuals and small teams.
Transparency. SaaS applications are deployed, supported, maintained and upgraded by the provider. Because SaaS applications are hosted in the cloud, updates and upgrades are available immediately to the users [Ju10]. Users typically do not need to install or set up any application on their local machines. Also, SaaS applications can be used from any operating system.
On-demand services. SaaS applications can be used as soon as they are needed. For example, to get access to Google Gmail, a user just opens a browser, logs into their account, and starts using the application.
Accessibility. SaaS applications can be accessed across the Internet by a user at any time.
Flexibility. SaaS applications can be customized to some degree, depending on how they were designed. Not all providers offer customization. For example, a customer may be able to modify a page layout.
Elasticity. SaaS applications are hosted in the cloud, so users do not need to install them on their local machines.
Simplification. Typically SaaS applications are accessed through web browsers or APIs, which do not require specific software on the client.
Security. It is possible to deploy secure applications if they have been built using a secure development methodology [Uzu12c].

The pattern also has the following potential liabilities:

Some applications demand high user interaction; in such cases the SaaS model may not be suitable because of network latency.
Since customers’ data is stored on the vendor’s servers, or even on a third party’s servers, data security becomes an issue.
The network used to access SaaS applications, such as the Internet, can be insecure. This can raise security issues such as integrity and confidentiality.
SaaS applications typically are unique to each provider, which makes it harder for users to switch to a different vendor.
SaaS applications may be updated frequently by their providers, which may make it difficult for users to manage the integration of SaaS applications with their business processes.
Unplanned upgrades can be a disadvantage, especially if they impose an unscheduled training requirement on the customer.
Cloud applications may introduce compliance issues because the users’ data is stored and managed by the provider.
If the provider goes into bankruptcy, lock-in can be an issue.

KNOWN USES Salesforce.com’s CRM (Customer Relationship Management) [Sal] is online web-based software that records, tracks, manages and analyzes sales data.
Google applications such as Gmail, Google Calendar and Google Docs [Goo1] are web-based applications that can be accessed through different thin clients with an Internet connection.
Freshbooks.com [Freb] is an online invoicing service intended mainly to serve small businesses.
IBM SmartCloud Solutions [IBMa] provides a set of software and business processes delivered by IBM as a service, including Business Analytics and Optimization, Social Business, Smarter Commerce and Smarter Cities.

SEE ALSO The INFRASTRUCTURE-AS-A-SERVICE pattern (page 413) describes the infrastructure to allow sharing of distributed virtualized computational resources [Has12a].
The PLATFORM-AS-A-SERVICE pattern (page 423) describes virtual environments for developing, deploying, testing, and managing applications online [Has12a].
The Misuse patterns in [Has12b] describe possible attacks on cloud environments.
The Party pattern [Fow97] indicates that users can be individuals or institutions.
The patterns in this book and our secure methodology (Chapter 3) can offer an effective way to make SaaS secure.