


DEGREE PROJECT FOR MASTER OF SCIENCE IN ENGINEERING

GAME AND SOFTWARE ENGINEERING

Mocking SaaS Cloud for Testing

Johannes Henriksson | Simon Svensgård

Blekinge Institute of Technology, Karlskrona, Sweden, 2017

Supervisor: Mikael Svahnberg, Department of Software Engineering, BTH


Abstract

In this paper we evaluate how software testing is affected by the usage of a mock-object, a
dummy implementation of a real object, in place of having data in a cloud that is accessed
through an API. We define the problems for testing that having data in the cloud brings, which of
these problems a mock-object can remedy and what problems there are with testing using the
mock-object. We also evaluate if testing using the mock-object can find the same faults as testing
against the cloud and if the same code can be covered by the tests. This is done at Blekinge
Institute of Technology (BTH) by creating an integration system for the company Cybercom
Sweden and Karlskrona Municipality. This integration system is written in C# and works by
syncing schedules from Novaschem to a cloud service, Google Calendar. With this paper we
show that a mock-object in place of a cloud is very useful for testing when it comes to clean-up,
triggering certain states and avoiding query limitations.

Keywords: Mock-Object, Cloud Computing, Testing, Test-Evaluation

Sammanfattning (Swedish summary)

In this work we evaluate how software testing is affected by the use of a mock-object, a
dummy implementation of a real object, instead of having data in a cloud that is accessed
via an API. We define the problems that arise from having data in the cloud, which of these
problems the mock-object can remedy and which problems the mock-object introduces. We
also evaluate whether testing with a mock-object can find the same faults as testing against
the cloud and whether the same code can be covered by the tests. This is done at Blekinge
Tekniska Högskola (BTH) by creating an integration system for the company Cybercom Sweden
and Karlskrona Kommun. The integration system is written in C# and works by syncing
schedules from Novaschem to a cloud service, Google Calendar. With this work we show that a
mock-object instead of the cloud is very useful when it comes to cleaning up after tests,
triggering certain states and avoiding limitations.

Keywords: Mock-Object, Cloud Services, Testing, Test Evaluation

Preface

This thesis is the final step of our Master of Science in Game and Software Engineering at BTH
and represents 20 weeks of full-time study.

At BTH, we would like to thank our supervisor Mikael Svahnberg and our examiner Emil
Alégroth for valuable feedback during the project, as well as Bogdan Marculescu for helping
us gain insight into the world of academia.

We want to thank Alexander Andersson and Johan Persbeck at Cybercom Sweden for providing
us with the idea and assistance throughout the project.

We also want to thank those who have read and given feedback on this thesis.

Nomenclature

Acronyms

API Application Programming Interface
AWS Amazon Web Services
FITTEST Future Internet Testing
IaaS Infrastructure as a Service
NIST National Institute of Standards and Technology
PaaS Platform as a Service
SaaS Software as a Service
SUT System Under Test
UI User Interface

Table of Contents

Abstract
Sammanfattning (Swedish)
Preface
Nomenclature
Acronyms
Table of Contents
1 Introduction
1.1 Introduction
1.2 Background
1.3 Objectives
1.4 Delimitations
1.5 Thesis question and technical problem
2 Theoretical Framework
2.1 Cloud Computing
2.2 Testing
2.3 Test Evaluation
3 Method - Design
3.1 Literature Study
3.2 Interview and Observations
3.3 Testing
3.4 Test Evaluation
4 Method - Execution
4.1 Literature Study
4.2 Interview and Observations
4.3 System Under Test
4.4 Testing
4.5 Test Evaluation
4.6 Validity Threats
5 Results
5.1 Research Question 1
5.2 Research Question 2
5.3 Research Question 3
6 Discussion
6.1 Coverage Measurements
6.2 Mutation Score
6.3 Advantages
6.4 Disadvantages
6.5 When to use a mock-object
6.6 Sustainability
7 Conclusions
8 Recommendations and Future Work
References
A Interview Developer Cybercom
B Example unit tests from Test Suite
B.1 EventExists
B.2 NewActivityAdded
C Example Mock-Object Unit Test
1 INTRODUCTION

1.1 Introduction

Cloud computing [3] has been on the rise over the past decade[18], with a heavy increase in
its use in the software industry. A cloud is, to put it simply, computer functionality that can be
reached over a network. Cloud computing can, depending on the cloud service, be used to host
executable code and/or data through different means of access, ranging from simple Application
Programming Interfaces (APIs) or User Interfaces (UIs) that give the user very little control to
full control over the system. The focus of this paper is the highest abstraction level of cloud
services, Software as a Service (SaaS), where the user has the least control over the cloud
system.

When it comes to testing a system that uses data stored on a SaaS cloud, there are some
extra challenges compared to having the data in a database. Since a SaaS cloud can only be
accessed at a high abstraction level, such as through an API, the available requests can be limited
both in how many requests can be sent and in the number of different request types, which limits
the testing that can be done. This area of testing is highly relevant to the modern internet, and not
much has been written about it before, which makes it an area with potential to evolve.

This study is focused on evaluating how testing such a system differs when using a mock of
the SaaS in the form of a database compared to regular testing of the system. This is done in two
main parts, with the first part being a classification of differences when testing an application
that makes use of a SaaS cloud and a database respectively to store data. This classification is
based on a literature study as well as observations and an interview at Cybercom.

The second part is done by implementing an integration system between a scheduling service
by Nova Software, Novaschem[26], and a SaaS, Google Calendar[12]. This is done to see in
practice what difficulties there are when it comes to testing this kind of system and whether a
mock of the SaaS cloud is useful in a concrete scenario. In addition, the testing is evaluated using
three test measurements: block coverage, branch coverage and condition coverage.

With respect to cloud testing, there is plenty of research discussing how to use the cloud for
testing[34][16]. However, very little is written about testing applications that use cloud
services. This study is made to see whether there are any specific problems, and whether and in
what way a mock pattern can be used to make testing easier.

1.2 Background

Cybercom Group Karlskrona has been tasked by the municipality of Karlskrona with creating
an integration system between Novaschem, a scheduling system by Nova Software, and Google
Calendar, a SaaS cloud. To make sure the system is robust, it needs to be well tested.
Cybercom does not have any special routines for testing applications that use data in
a cloud, and there are some challenges within this area that we thought could be explored more in
depth, with the possibility of finding a better testing routine.

1.3 Objectives
The goal of this thesis project is to create a stable integration system between a scheduling service
called Novaschem and Google Calendar. This integration system needs to be tested thoroughly to
ensure that it is stable and functional. When testing, two different approaches will be compared:
regular unit testing against Google Calendar, the SaaS cloud, and unit testing using a database as
a mock object. These two approaches will be evaluated and compared both in terms of the
challenges and experiences of applying the techniques, and by measuring different unit
test coverages for the two test suites.

1.4 Delimitations
The implementation of the integration system which will be used to compare and evaluate the
testing will make use of a single SaaS, Google Calendar, and results could vary depending on
what SaaS is used. However, some general conclusions can be drawn from the results when
taking into account the other data sources for the case study, such as the theoretical framework
and the documentation of other SaaS services. When looking at these SaaS services we limit
ourselves to third-party public SaaS clouds.

The study is limited to unit testing and to how a database mock object can be used in place of a
SaaS for this purpose.

1.5 Thesis question and technical problem


The technical problem that has led to the research questions for this thesis is the implementation
of a robust integration between a scheduling system called Novaschem by Nova Software and the
cloud calendar system by Google, Google Calendar.

RQ1: When it comes to testing, what are the main differences between an application using
SaaS instead of databases for data storage?

RQ2: What are the challenges and experiences when unit testing against a mock-object compared
to unit testing against the actual API when testing a cloud integration application?

RQ3.1: Given the proposed integration system, will the test coverage achieved differ between
testing using the SaaS and using the mock-object?

RQ3.2: Given the proposed integration system, will the found defects differ between testing
using the SaaS and using the mock-object?

2 THEORETICAL FRAMEWORK

The main theoretical parts of this study are cloud computing with a focus on SaaS clouds; testing
using unit testing and mock-objects; and test evaluation using code coverage and mutations.

When it comes to testing and clouds there are many scientific papers about testing applications
by using the cloud for computing power, called Testing as a Service (TaaS)[34][43], or about testing
systems located on a cloud, SaaS applications[16][32]. These parts of testing and clouds are not
relevant for this study and are not included in the theoretical framework.

When it comes specifically to testing systems that use data in a SaaS cloud there is less
material available. The framework thus contains separate research on cloud computing and mock
testing, as well as the FITTEST research project, which is a more general research project spanning
more than only SaaS clouds.

2.1 Cloud Computing


When talking about cloud computing, different researchers give different definitions of what it
actually means.

In an article by Armbrust et al.[4] at UC Berkeley, cloud computing is defined as the sum of SaaS
and Utility Computing. SaaS refers to software or services being delivered to customers across the
internet, with the software running in datacenters instead of on client computers. Utility computing
deals with the sale of computational resources, where clients are charged for computation time
instead of paying flat rates for renting actual hardware. Armbrust et al.[4] also limit the definition
of cloud computing to only include public clouds, which means that the systems are made
available to the general public in some form, removing businesses' internal datacenters from the
cloud computing definition.

The National Institute of Standards and Technology (NIST) in the USA has also published a
definition of cloud computing which is more extensive. It states:

"Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to
a shared pool of configurable computing resources (e.g., networks, servers, storage, applications,
and services) that can be rapidly provisioned and released with minimal management effort or
service provider interaction. This cloud model is composed of five essential characteristics, three
service models, and four deployment models." [22]

The five essential characteristics are:

• On-demand self-service, which means that consumers should be able to provision
computing server time, network storage and similar on their own without contacting service
providers.
• Broad network access, which means that the service should be available over the network
and accessible on different client platforms and devices such as phones, tablets, laptops or
computers.
• Resource pooling, which means that the provider's computing resources should be pooled
and serve multiple consumers by reassigning virtual resources dynamically to serve
different consumer demands. The consumers should not have any direct knowledge about
the location of physical resources except at a high abstraction level, such as where the
datacenter is located.
• Rapid elasticity, which means that computing capabilities should be elastically provisioned
and released, preferably automatically, so that they scale with consumer demand, thus
creating a sense of unlimited computing resources.
• Measured service, which means that cloud systems should automatically measure the
service (bandwidth, storage, processing etc.) and use the measurements to control and optimize
resource usage. The measurements can also be used for statistics for the consumers as well as
for the provider to calculate the price of the service depending on the consumer's usage.

The four deployment models deal with access to the cloud infrastructure: private, public,
community and hybrid. A private cloud is a deployment that is exclusive to a single
organization, while a community cloud serves a specific community of consumers that share
specific concerns, e.g. security requirements. Public clouds are open to the general public
instead of being restricted to specific organizations. Hybrid clouds are a composition of at least
two of the previous three deployment models that remain unique entities. However, the models are
bound together by proprietary or standardized technology to enable portability of applications
and data, which enables techniques like cloud bursting (using computational resources from the
other cloud infrastructure in times of high demand).

Figure 2.1: Image showing the different service models with their different access levels.

The three service models represent different abstraction levels up from the hardware. Starting
with the lowest level of abstraction we have Infrastructure as a Service (IaaS) which is more or
less just providing hardware capabilities with an underlying cloud infrastructure. The consumers
can choose everything from operating system and upwards themselves according to their needs.

At the next level of abstraction we have Platform as a Service (PaaS) which takes away certain
control from the users. PaaS is used by users wanting to run their own application, without
keeping it running on their own hardware. The PaaS infrastructure does not let users control any
of the underlying cloud infrastructure which includes operating system, storage, network and
servers. The users only control the deployed applications and configuration settings relating to
the platform environment.

The final service model, with the highest abstraction level, is Software as a Service (SaaS), in
which even the application is provided as a service. These applications run on the cloud
infrastructure and are available to the users through different clients such as web clients or
mobile application clients. The only configuration available to the users is application-specific
settings, leaving no control over the underlying components like operating systems or servers.
Some SaaS systems also provide an API (Application Programming Interface) for communicating
with the service, allowing other developers to create their own systems that use the existing
system or its data in some way.

The three service models can be visualized as different layers of abstraction,
where the main difference is the access available, as illustrated in figure 2.1. As the figure
illustrates, the service models can build upon each other, where each subsequent service
model can be created based on the previous service model. This means that a SaaS cloud can be
run upon a PaaS cloud which in turn runs on an IaaS cloud. This however depends on the actual
cloud provider, and different providers grant access to the different service models, usually at
different price rates. One example is Microsoft Azure [31], which provides either a PaaS cloud or
an IaaS cloud.

2.2 Testing
Software testing[29] is a practice that can be used during the entire development life-cycle of a
software system. Starting from the requirements of the software, tests can be designed before the
actual program is written and later be used to assess that the software fulfils the
specifications[15]. For a released product, testing can be used both to further improve the system
by finding previously hidden defects and to verify that new features and updates do not break
previous functionality.

Testing can also be used to ensure a certain quality in the developed software[6]. Testing can be
used to find defects in the software and ensure that the software adheres to the specified design
and purpose. Testing can verify whether a system performs as expected, both
during the development phase and during refactoring or maintenance. By discovering and
eliminating defects in the software, the quality of the system is improved.

2.2.1 Unit Testing


Unit testing is a testing method where small parts of the code, units, are tested individually. In a
system that uses an object-oriented language and practices, the units are most often the classes
of the system. Unit tests are used to test the functionality of the classes by calling different
functions using specific input data. Due to inter-class dependencies in the system, a class cannot
always be isolated in unit tests. When the unit tests cover bigger and bigger parts of the system,
the unit tests themselves generally become more and more complex as well, due to the inter-class
dependencies[28].

2.2.1.1 Unit Testing with Mock-Objects
In 2001 Mackinnon et al. introduced the concept of unit testing with mock objects in the paper
"Endo-Testing: Unit Testing with Mock Objects"[19]. In the paper they describe mock objects as
dummy implementations that emulate the real functionality of the objects that they represent. A
unit testing pattern by Clifton[7] provides a brief description of the usage of mock objects for
unit testing. The pattern makes use of abstract methods, interfaces and the factory pattern so that
the code can work with both the real implementation object and the mock object implementation.
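
As a minimal sketch of such a pattern, the domain code below depends only on an interface, and a
factory decides whether the real or the mock implementation is handed out. All names
(CalendarStore, GoogleCalendarStore, InMemoryCalendarStore, make_store, sync_lesson) are
hypothetical and chosen for illustration; they are not taken from Clifton's pattern, from the
thesis system or from any Google library. Python is used for brevity, although the system in
this thesis is written in C#.

```python
from abc import ABC, abstractmethod


class CalendarStore(ABC):
    """Interface that the domain code depends on (hypothetical)."""

    @abstractmethod
    def add_event(self, event_id, title):
        ...

    @abstractmethod
    def event_exists(self, event_id):
        ...


class GoogleCalendarStore(CalendarStore):
    """Placeholder for the real object; a real version would call the SaaS API."""

    def add_event(self, event_id, title):
        raise NotImplementedError("would send a request to the cloud API")

    def event_exists(self, event_id):
        raise NotImplementedError("would send a request to the cloud API")


class InMemoryCalendarStore(CalendarStore):
    """Mock object: keeps events in a local dictionary instead of the cloud."""

    def __init__(self):
        self._events = {}

    def add_event(self, event_id, title):
        self._events[event_id] = title

    def event_exists(self, event_id):
        return event_id in self._events


def make_store(use_mock):
    """Factory: tests request the mock, production code the real object."""
    return InMemoryCalendarStore() if use_mock else GoogleCalendarStore()


def sync_lesson(store, event_id, title):
    """Domain code: add the lesson only if it is not already in the calendar."""
    if store.event_exists(event_id):
        return False
    store.add_event(event_id, title)
    return True
```

Because sync_lesson only sees the CalendarStore interface, the same test can in principle be run
against either implementation by changing the argument given to the factory.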

Thomas and Hunt[37] present a list of seven reasons for using mock objects, simplified from the
original paper by Mackinnon et al., as seen in the list below.

• The real object has non-deterministic behaviour.
• The real object is difficult to set up.
• The real object has behaviour that is hard to trigger.
• The real object is slow.
• The real object has (or is) a user interface.
• The test needs to ask the real object about how it was used.
• The real object does not exist yet.

However, there are also certain limitations associated with testing using mock objects. A
big risk when using mock objects is the validity, or how well the mock represents the actual
object. Errors that are present in the mock object can both fail tests that should pass and
pass tests that should fail, leaving errors that might have been caught when testing with the
actual object.

The procedure for unit testing using mock objects described by Mackinnon et al.[19] can be
described in a simple step-by-step format.

• Create instances
• Set the states in the mock objects (Setup preconditions)
• Set the expectations for the mock objects (Setup expected results)
• Call the domain code, using the mock objects (Run the actual test case)
• Verify the mock-objects (Verify the results with expected results, Asserts)
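
The five steps above can be sketched with Python's standard unittest.mock module. The domain
function sync_lesson and the exception QuotaExceeded are hypothetical names invented for this
illustration; the second test shows how a mock can trigger a state, here a rejected request,
that is hard to provoke on purpose against a real cloud service.

```python
from unittest.mock import Mock


class QuotaExceeded(Exception):
    """Hypothetical error for when the cloud API rejects further requests."""


def sync_lesson(calendar, event_id, title):
    """Hypothetical domain code: add a lesson unless it already exists."""
    try:
        if calendar.event_exists(event_id):
            return "skipped"
        calendar.add_event(event_id, title)
        return "added"
    except QuotaExceeded:
        return "retry-later"


def test_new_lesson_is_added():
    calendar = Mock()                            # 1. create instances
    calendar.event_exists.return_value = False   # 2. set the state of the mock
    expected = "added"                           # 3. set the expectations
    result = sync_lesson(calendar, "math-1", "Mathematics")  # 4. call the domain code
    assert result == expected                    # 5. verify result and mock usage
    calendar.add_event.assert_called_once_with("math-1", "Mathematics")


def test_quota_error_is_handled():
    calendar = Mock()
    # State that is hard to trigger against the real service on demand.
    calendar.event_exists.side_effect = QuotaExceeded()
    assert sync_lesson(calendar, "math-1", "Mathematics") == "retry-later"
    calendar.add_event.assert_not_called()
```

Note that this sketch uses a generated mock, whereas the approach studied in this thesis uses a
database as the mock object; the step sequence is the same in both cases.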

2.2.2 FITTEST Research Project


Between 2010 and 2013 there was a research project funded by the European Commission and led
by Dr. Tanja E. J. Vos from Universidad Politécnica de Valencia. This research project was
called Future Internet Testing (FITTEST) and focused on developing a test suite for Future
Internet applications [11]. When testing Future Internet applications there are several challenges
compared to testing a regular application [39]. The challenges that the researchers identified
with testing Future Internet applications are described in the list below.

• Self Modification and Autonomic Behaviour: Many Future Internet applications make use
of Service Level Agreements and dynamically loaded components to better be able to adapt
to different use scenarios. Together with the autonomous behaviour of Future Internet
applications, which makes them hard to define properly during the design phase, this creates
a greater need for testing. In the FITTEST project this need is met with Continuous Testing.
• Asynchronous Interactions: Future Internet applications are highly asynchronous, with many
clients accessing the applications and services with multiple requests. Testing therefore has
to take additional concurrency aspects into account to achieve proper test coverage.
• Time and Load Dependent Behaviour: Reproducing bugs is made difficult due to the effect
that timing and load conditions have on the applications, which can cause specific errors
only during very specific conditions.
• Huge Feature Configuration Space: Future Internet applications usually have a large
number of options, configurable features and environment details. This gives the
applications a larger domain that requires testing.
• Ultra Large Scale: Generally speaking, Future Internet applications consist of systems of
systems, which causes low test coverage even in good test situations due to the inadequacy
of traditional testing criteria for these types of systems.
• Low Observability: The systems that make up the Future Internet applications are
increasingly third party systems or services. These systems and services are often accessed
in a black box fashion, which is harder to test compared to in-house systems.

To deal with the identified challenges, the researchers, Vos et al.[40], developed several different
testing techniques tailored to Future Internet applications. These techniques were then used in
case studies at four different companies: IBM, Softeam, Sulake and Clavei. Each case study used
a different subset of testing techniques depending on the needs of the company. The techniques
developed for this research project are briefly described in the list below.

• Continuous Testing: For continuous testing they have developed a technique that creates
new test cases based on logs generated by the end-users running the system. By using
the logs they infer a finite state machine as a behaviour model for the System Under Test
(SUT). By traversing the state machine new test cases can be generated. To make sure that
the new test cases are different from the logged executions, a combinatorial approach is used
on the execution parameters. This technique uses oracles that are inferred from the same
logs used for the test case generation. Because of this, all of the errors discovered need to
be checked manually to make sure that they actually represent errors.
• Regression Testing: They use audit testing for their test suite as a form of regression
testing. Their goal with this technique is to make sure that new services or new releases of
current services are compliant with the system. Their approach is to perform a test case
prioritization using a technique called Change Sensitivity Test Prioritization, which detects
the most important test cases based on mutations of the semantics.
• Combinatorial Testing: This is a technique that designs test cases for a SUT by combining
different input parameters. To improve the results they use a simulated annealing
hyperheuristic search algorithm. This is a type of unsupervised machine learning
that learns a combinatorial strategy to improve the test case generation.
• Rogue User Testing: This type of testing is focused on graphical UIs as a way to automate
the testing of all possible actions that the user can take from the interface provided by the
application. The test uses the state of the graphical user interface as a starting point and
selects and executes an action from all possible actions available in the current state. An
oracle is then used to check the resulting state, saving sequences that produce invalid
states for replay.
• Concurrency Testing: This technique deals with testing the issues that arise from concurrent
users such as data races and deadlocks. The FITTEST project has improved on an existing
testing tool by IBM called IBM Concurrency Testing Tool. It works by inserting noise and
delays into the program and then testing commands using different schedules to identify
concurrency problems and unintended behaviour.
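
To illustrate only the basic idea behind combinatorial testing, combining input parameters into
test cases, the sketch below enumerates every combination exhaustively. This is deliberately
the naive baseline and not the simulated annealing hyperheuristic search used in FITTEST; the
parameter names and values are assumptions made up for the example.

```python
from itertools import product


def all_combinations(parameters):
    """Yield one test case (a dict) per combination of parameter values."""
    names = sorted(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical input parameters for a calendar synchronisation test.
parameters = {
    "calendar_exists": [True, False],
    "event_count": [0, 1, 50],
    "time_zone": ["UTC", "Europe/Stockholm"],
}

test_cases = list(all_combinations(parameters))
print(len(test_cases))  # 2 * 3 * 2 = 12 combinations
```

The number of combinations grows multiplicatively with every added parameter, which is the
reason search-based strategies are used to select a smaller set of test cases.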

2.3 Test Evaluation


There are many different characteristics that can be used when evaluating different test suites.
You can look at characteristics like cost of generating tests, the time required to run the test,
different measurements that describe to what extent different areas of code are tested and the
number of faults they detect.

2.3.1 Test Coverage


There exist many different types of measurements that can be evaluated for a test suite [13, 1,
44]; however, these measurements do not on their own specify how well a program is tested.
Rather than measuring how well a program runs, these measurements are used when comparing
different test suites to each other. When the measurements are combined with the results from the
tests, such as errors and defects found, they can be used as a basis for comparing test suites. In
the list below three different coverage measurements are described.

• Code coverage measures the percentage of the code that is covered by tests. Block coverage
and Statement coverage are two coverage measurements with the same focus. Block
coverage measures in terms of blocks, which are sequences of statements without any outward
or inward flow of control, while code coverage and statement coverage measure in terms
of statements (rows of code). Code and Block coverage are two common coverage criteria
used by many researchers [13, 1] when comparing test suites or coverage criteria. Full code
or block coverage is achieved when all blocks or statements in the program are executed at
least once by the tests in the test suite.
• Branch coverage, also known as decision coverage, is another common coverage criterion[13,
1, 44]. Instead of dealing with the percentage of the program's statements, like code
coverage, it measures how many branches have been traversed by the tests. A branch is a
point where a decision can evaluate to either true or false and take different paths
depending on the result. As such, complete
branch coverage is achieved when all of the branches have been executed at least once by
the test suite. This means that all points of control flow in the program must evaluate to
both true and false for complete coverage.
• Predicate coverage, also known as Condition coverage[13], can be seen as a more thorough
version of branch coverage. Instead of looking at only the outcome of a decision, as is done
in branch coverage, predicate coverage looks at every individual condition. Complete
predicate coverage therefore requires that every single predicate, or condition, in all of
the program's decisions evaluates to both true and false during the execution of the
test suite.
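
The difference between branch coverage and condition coverage can be shown with a small sketch.
The function needs_sync and the helper coverage_of are hypothetical and written only for this
illustration; a real project would use a coverage tool instead of tracking values by hand. The
helper models the short-circuit evaluation of `or`, so the second condition only counts as
evaluated when the first one is false.

```python
def needs_sync(local_changed, remote_changed):
    """One decision made up of two conditions (hypothetical example)."""
    if local_changed or remote_changed:
        return True
    return False


def coverage_of(test_inputs):
    """Report whether the inputs give full branch and condition coverage."""
    seen = {"decision": set(), "local_changed": set(), "remote_changed": set()}
    for local_changed, remote_changed in test_inputs:
        seen["local_changed"].add(local_changed)
        if not local_changed:
            # `or` short-circuits: remote_changed is only evaluated here.
            seen["remote_changed"].add(remote_changed)
        seen["decision"].add(needs_sync(local_changed, remote_changed))
    both = {True, False}
    return {
        "branch": seen["decision"] == both,
        "condition": seen["local_changed"] == both
        and seen["remote_changed"] == both,
    }


# The decision is both true and false, but remote_changed is never true.
print(coverage_of([(True, False), (False, False)]))
# One more test makes every condition evaluate to both true and false.
print(coverage_of([(True, False), (False, False), (False, True)]))
```

With the first pair of tests both branches are taken, yet a fault that only shows when
remote_changed is true would go unnoticed, which is what condition coverage is meant to expose.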

8
2.3.2 Fault Detection
When evaluating and comparing different test suites, a measurement of how many actual faults or
defects they detect is good to take into consideration. A way to do this is by creating mutants of
the program. A mutant is a program with a planted fault[45]. If a test suite detects a fault when
running the mutated program, by having a different result on at least one test, the mutant is
considered to be dead or killed. Test suites can then be evaluated and compared based on how many
mutants the respective test suites manage to kill. The percentage of killed mutants out of the
mutants created can be used as a measurement of mutation adequacy[45].
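
As an illustration, the percentage described above can be computed as in the sketch below. It is written in Python for brevity rather than in the language of the system under test, the function name is hypothetical, and the numbers in the example are made up and are not results from this study.

```python
def mutation_score(killed: int, total: int) -> float:
    """Return the percentage of created mutants that the test suite killed."""
    if total <= 0:
        raise ValueError("the number of created mutants must be positive")
    if not 0 <= killed <= total:
        raise ValueError("killed must be between 0 and total")
    return 100.0 * killed / total


# Made-up example: a suite that kills 40 out of 50 mutants.
print(mutation_score(40, 50))  # 80.0
```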

3 METHOD - DESIGN

A case study is used to investigate contemporary phenomena in a specific context while making
use of multiple sources of evidence[33]. Because of these characteristics a case study, using the
guidelines provided by Runeson and Höst, was chosen as the base of the scientific method for
this thesis. The case used in the case study was provided by Cybercom, an external company that
needed a system implemented. The selection of the case can be seen as a revelatory study. This
thesis makes use of different sources of data to answer the research questions, which is a defining
criterion of a case study.

The data sources for the case study are the implementation of the integration system, which is
used as the actual case, a literature study, field observations and an interview with a developer at
Cybercom. The implementation part includes both the actual implementation of the system but
also the testing, coverage and fault detection measurements performed on the finished system.
The data from the different sources is compared and combined to produce the results.

To get the results for RQ1, the literature, documentation for cloud services and the observations
and interview at Cybercom are used to identify the differences. When the differences are identified,
the results from RQ1 are used as a base to get results for RQ2, by looking at how these differences
are affected by the use of a mock-object. Observations are made during testing to see if there are
additional experiences or challenges with using and implementing a mock-object for the SaaS
cloud. Lastly, to answer RQ3, test coverage and fault detection are analysed using the developed
mock-object.

Using an experiment as the scientific method was considered; however, in the scope of the entire
thesis a case study was deemed to be the better choice. The study makes use of both quantitative
and qualitative data collection, while an experiment would look only at quantitative data[42]. These
qualitative aspects make a case study a better choice than an experiment for this thesis.

Action research was also evaluated as a possible methodology. Action research uses
participant observation and interviews as key data collection methods[17]. While participant
observation, an interview and field observations are all used for data collection in this thesis, a
big part of the research consists of actual implementation and testing as well as measuring the
differences between test suites. As such it is a better match to use a case study as the methodology
for this project.

The different data collection parts that are used for this study are described in greater detail in the
sections below.

3.1 Literature Study


A literature study is a necessary part of the project, both to determine the research gap and which
testing approaches are relevant for testing applications using data in the cloud, and to provide a
solid theoretical framework for the project.

The literature study is used to derive the theoretical framework and also as a base for answering
the first research question. The relevant scientific literature is used in combination with an
interview of an employee of Cybercom Karlskrona and technical documentation of SaaS APIs as
a base for mapping out the differences between testing an application using SaaS or databases.
This mapping is derived from a search of differences between cloud systems, with a specific
focus on SaaS clouds, and databases.

3.2 Interview and Observations


Since the study is done for a company with experience in implementing and testing similar
systems, developers at the company can give insight into the subject and what is common practice.
Throughout the study, observations can be made.

Talking with developers that are knowledgeable about the subject adds a good base for knowing
about the relevance of the subject, if the current way of testing is in need of improvement, and
also what is common practice when testing systems with data in a SaaS cloud.

In addition to observations made during informal meetings, a more formal semi-structured interview
is planned[8]. This kind of interview uses a set of prepared open questions so that the interviewed
person gets a clear topic but can stray a bit so that unexpected perspectives may appear.

3.3 Testing
To see what difficulties and experiences there are when testing an application that uses data in
a SaaS cloud, and to what extent a mock-object can be used for testing, a system with these
properties is used. In order to compare the differences directly, a Mock-Object Pattern[7] which
uses a database as the base for the mock-object can be used.

After implementing the mock-object, the system is tested using a test suite, which is run using
both the mock-object connection and the real SaaS connection. This test suite is used to evaluate
the usage of a mock-object using a database for replacing SaaS during testing. Tests that cannot
use both the mock-object and the SaaS connection are not part of the test suite used for
measurements on usage of mock-object compared to SaaS. Some tests like these are however created
to evaluate the challenges and experiences gained by testing with a mock-object and a SaaS.

3.4 Test Evaluation


The evaluation of the testing done on the system is done in two parts, coverage measurements
and fault detection.

3.4.1 Test Coverage


For the coverage measurements, three different types of coverage are evaluated:

• Statement coverage
• Branch coverage
• Condition coverage.

The planned process for collecting the coverage data is described in figure 3.1.

The three coverage measurements are chosen as they represent which parts of the code
are exercised when running the unit tests. Both the Statement and Branch coverage provide
a good overview of what parts of the code are completely uncovered by tests, which can
help in generating new test cases that exercise the uncovered code. The Condition coverage
measurement, while similar to Branch Coverage, is chosen because it represents more specific
paths in the system. The Condition Coverage will, when used in relation to the Branch Coverage, be
able to show where branches are not triggered by certain flags. This can point towards specific
cases which are missed by the unit tests. While coverage does not give any indication of whether
the unit tests performed are of a good quality or not, it can still provide useful information about
whether the test runs differ.

If the coverage results differ when running the test suites with different connections, it can
point to different things. If the combined coverage is greater than the other two measurements,
different code is executed when running the tests with the two different connections. This can
potentially point towards validity concerns in the mock-object, since the results returned from the
mock-object could be different from what is returned by the real SaaS object. The same can be
said for when the coverages differ between the mock object and the real object. If the coverages
are different it points to inconsistencies between the mock object and the real object. While
the mock-object should be a simplified implementation of the real object, the key functionality
should still stay the same to gain the ability to use it for testing. So having the same coverage for
the two connection types is the desired outcome, since it points to the connection types running
the same code.
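
The reasoning above can be expressed as a small comparison of the sets of covered code elements from the two runs. The sketch below is only an illustration in Python: it assumes that each run can be exported as a set of identifiers (for example statement or branch ids), which is a hypothetical representation and not the output format of the coverage tools used in this study.

```python
def compare_coverage(mock_covered: set, saas_covered: set) -> dict:
    """Compare what is covered when running with the mock and with the SaaS."""
    combined = mock_covered | saas_covered
    return {
        # Elements exercised by only one of the two connection types.
        "only_mock": mock_covered - saas_covered,
        "only_saas": saas_covered - mock_covered,
        # True when the combined coverage is greater than both individual
        # measurements, i.e. the two runs execute partly different code.
        "combined_exceeds_both": len(combined)
        > max(len(mock_covered), len(saas_covered)),
        # The desired outcome: both connection types run the same code.
        "identical": mock_covered == saas_covered,
    }


# Made-up example: element 5 is only reached with the mock, 4 only with the SaaS.
report = compare_coverage({1, 2, 3, 5}, {1, 2, 3, 4})
print(report["combined_exceeds_both"], report["identical"])  # True False
```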

Figure 3.1: Design of how the coverage measurements are taken. The code base will be
tested by running two different test suites, as well as running them together. Coverage
measurement tools are then used to obtain the coverage measurements for the two test
suites as well as the measurements when combining the test suites.

3.4.2 Mutation Testing


When measuring the number of faults each test suite can find, the process in figure 3.2 is used.
The system is injected with synthetic faults, creating mutated program versions called mutants.
These mutant programs are then tested using the test suite, counting the number of mutants that
are successfully eliminated (detected by the tests). The results from this process are presented
as the percentage of the total number of mutants that each test suite is able to eliminate, called
the mutation score.

In the case of the mutation score being different between the two connection types, it can happen
in two different ways. On one hand, if the real object manages to eliminate mutations that the
mock-object does not, it points towards mock-objects not being able to comfortably replace SaaS
for testing purposes. When this happens it can be concluded that the mock-object can not be
used to replace the real object for testing as the test suite using the mock-connection is not able to
properly identify faults in the code.

On the other hand, if the mock-object instead manages to eliminate mutations that the real SaaS
object does not, it becomes trickier. These faults then need to be examined, since they could
represent actual errors in the code. But even if they represent an actual error, it is questionable
whether the error is relevant. If the error cannot be found when using the
SaaS, is it an actual problem for the system? Fixing errors that cannot happen during production
could be seen as a waste of resources.

The goal is of course to have the mutation score the same when using the mock-object connection
and the real SaaS connection, as it points towards the mock-object being able to replace the SaaS
object for testing purposes.
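
The three outcomes discussed above can be separated by comparing the sets of mutants killed with each connection type. The following Python sketch assumes that each mutant has a unique identifier; the identifiers, function name and verdict strings are hypothetical and only serve to illustrate the reasoning.

```python
def compare_kills(mock_killed: set, saas_killed: set) -> dict:
    """Classify the difference between mutants killed with the mock and the SaaS."""
    missed_by_mock = saas_killed - mock_killed  # faults only the real SaaS reveals
    only_mock = mock_killed - saas_killed       # faults only the mock reveals
    if missed_by_mock:
        verdict = "mock misses faults found with the SaaS"
    elif only_mock:
        verdict = "mock finds extra faults that need to be examined"
    else:
        verdict = "same mutants killed with both connections"
    return {
        "missed_by_mock": missed_by_mock,
        "only_mock": only_mock,
        "verdict": verdict,
    }


# Made-up example where both connection types kill exactly the same mutants.
print(compare_kills({"m1", "m2"}, {"m1", "m2"})["verdict"])
```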

Figure 3.2: Design of how the number of faults detected are measured. The integration
system receives planted faults creating a set of mutants. These mutants are then tested
using the test suites to obtain the percentage of eliminated mutants compared to number of
mutants created.

4 METHOD - EXECUTION

4.1 Literature Study

The literature study is executed using the reference databases Scopus[35] and Inspec[14] as well
as some assistance from researchers at BTH.

Since cloud is a concept that is defined in many different ways, some precaution is needed to
find relevant information. As noted in section 2, there is a lot of research on testing using
clouds (TaaS) and on testing systems that are located on a cloud, which is irrelevant for this
study. In order to exclude it, the results need to be filtered by reading titles and abstracts.

FITTEST[39] lists challenges with testing applications in the "Future Internet", which includes
SaaS clouds. This list is used as a base for identifying the differences between testing an
application using SaaS or databases.

By looking at characteristics that define cloud systems, and SaaS clouds in particular, different
aspects that have an effect on the testing of applications using these systems can be found.
The characteristics that define cloud systems are based in the theory and definitions of cloud
systems. The cloud system definitions and theory can be looked at together with actual technical
documentation of specific cloud systems to provide a set of constraints that affect the way
applications that make use of these systems can be tested. These characteristics and constraints
are then compared to how applications that use a database can be tested.

4.2 Interview and Observations

The company that the project is done for, Cybercom, has a Google integration consultant with
experience in developing software that uses data in SaaS clouds as well as databases. Because
of that, a person with knowledge on the subject was easily accessible for the semi-structured
interview.

The interview, Appendix A, is conducted at the beginning of the project, taking place at Cybercom
Karlskrona using questions that have been prepared before the interview. The interview is made
face to face, while using a recording device to record the audio of the interview to avoid
transcribing during the interview. The interview is then manually transcribed into text at a later
date.

The questions used for the interview are created with the intent of seeing which issues are
perceived as problems by a professional developer with experience in the field. The
problems that have been identified by FITTEST and are applicable to systems using
SaaS clouds are used as a base for the questions, while the questions are kept open in case
there are other difficulties that were not identified prior to the interview.

As for the observations, regular meetings are held at Cybercom, where problems
and solutions are discussed with developers.

4.3 System Under Test
The system that is used for testing is an integration system between a scheduling software called
Novaschem, by Nova Software, and Google Calendar’s SaaS. An overview of the integration
system's connections can be seen in figure 4.1. The testing is done on the integration
system (pink box), with the connection being controlled by a factory which instantiates either the
real connection, to Google Calendar, or a connection to a Mock-Object which uses a MySQL
database. The following subsections describe the system in greater detail.

Figure 4.1: An overview of the System under Test, with the connections mapped out. The
outgoing connection is based on the Mock-Object pattern. The system makes use of a
factory to create the mock-object or SaaS connection depending on the chosen setting.

4.3.1 Google Calendar


The SaaS database that is used for this case study is Google Calendar. Google Calendar is a
free-to-use online calendar service that can be accessed through a normal web browser such as
Internet Explorer or Google Chrome, or it can be accessed through an API which is compatible
with a wide array of languages and platforms, for example Python, Java, PHP and C#/.NET, as
well as directly over HTTP.

All possible requests for the API, and how to use them, can be found on the
Google developers webpage[12]. The core functionality provided by the API supports different
methods of modifying the calendars and events. Events can be created, removed or edited for
different calendars owned by users. Likewise, calendars can also be created, edited
and removed for users; however, the creation of calendars is more restricted, with heavier usage
limits.

Google Calendar has several different usage limitations [5] when it comes to accessing the API,
as is common when it comes to SaaS clouds. These limitations cover different aspects of the API
and have different effects when they are exceeded. The limitations for Google Calendar can be
seen in table 4.1. As we can see in the table, there is a hard limit for the number of queries or
requests that can be done during a day. There is also a limit preventing too many queries from
being done in a short period of time, in this case 100 seconds. These hard limits prevent any
further requests for the duration the limit covers, either for the rest of the day or for the remainder
of the 100 seconds.

For Google Calendar there are also more specific limitations imposed on the API, as seen in
the bottom rows of table 4.1. These limits concern specific requests or actions that the user can
perform, as a safeguard against either faulty or malicious use. When these specific limitations are
exceeded the SaaS goes into read only mode for the user that exceeds the limitations. The exact
lockout time is not specified in the official documentation but rather it states that the lockout
lasts for several hours. These limitations are above what Google considers normal usage of the
SaaS and are as such only likely to be reached when performing administrative tasks using third
party applications that make use of the API. To get a better understanding of the limits, and the
durations which are not specified in the documentation, a small test was conducted to help with
deciding the overall control flow of the synchronization.

Type of Request     Limit       Limit Duration     Penalty for exceeding
All Queries*        1’000’000   Per Day            Hard limit - requests fail for limit duration
All Queries*        500         Per 100 Seconds    Hard limit - requests fail for limit duration
Create Calendar†    25          Short Duration     API becomes read only for a few hours
Create Event†       10’000      Short Duration     API becomes read only for a few hours
Invite to Event†    100-300     Short Duration     API becomes read only for a few hours
Table 4.1: Google Calendar usage limits
* https://developers.google.com/google-apps/calendar/pricing
† https://support.google.com/a/answer/2905486?hl=en
"Short Duration" is the limit that is mentioned in the documentation. Results from a short test to
determine the duration can be seen in section 5.1.3

4.3.2 Integration System


The integration system is made in Microsoft Visual Studio using C# and .NET. C# and .NET
are used since .NET is the only platform that is supported by the API of Novaschem.
This integration system is made for the Karlskrona Municipality to synchronize the schedules
for pupils and teachers in the public schools to Google Calendar. Since all pupils and teachers
already use Google Groups and Google Mail they already had Google accounts and with that
Google Calendar, so having the schedules on Google Calendar makes access to them much easier.

The system is made to be modular so that it can be modified to an integration between other
calendar systems if needed. One part of the system is used for getting source data from Novaschem
and parsing that data. If changes to the schedules have been made, these changes are interpreted
into task objects. These tasks contain data that is needed to perform a task such as to create an
event. All tasks are then sent in a list to another part of the system that manages the outgoing
connections, either to Google Calendar or the Mock database.

The system is a Windows service, which means it runs in the background with no UI. Settings can
be changed in a config file, and output such as errors and information messages can be sent to the
administrator through the Windows event logger and email.

4.3.2.1 Source Data Connection


The source data is taken from Novaschem through their API. This source data is received as a
serialized C# file. The data representation is created from an XML schema using Microsoft’s
XML Schema Definition tool (XSD.exe)[24], which generates C# classes from
XML schemas. The source data can then be deserialized into the C# data structure representation
or parsed directly in XML format.

The source data contains the needed information about the events that are synchronized to Google
Calendar. The data contains information about time, location and participants for the different
events as well as information about course name and event descriptions.

When the source data is imported it is interpreted by the system which generates tasks. These
tasks are designed to contain the necessary information required to synchronize a certain event or
calendar with the SaaS cloud.
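
A minimal sketch of how such tasks could be derived is given below. It is written in Python for illustration and is not the C# implementation of the integration system. The Task type, the action names and the representation of the schedules as dictionaries keyed by event id are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    """Hypothetical task object: one action to perform on one event."""
    action: str                 # "create", "edit" or "delete"
    event_id: str
    data: Optional[dict] = None


def generate_tasks(previous: dict, current: dict) -> list:
    """Turn the difference between two schedule snapshots into tasks."""
    tasks = []
    for event_id, data in current.items():
        if event_id not in previous:
            tasks.append(Task("create", event_id, data))
        elif previous[event_id] != data:
            tasks.append(Task("edit", event_id, data))
    for event_id in previous:
        if event_id not in current:
            tasks.append(Task("delete", event_id))
    return tasks


# Made-up example: one lesson is moved, one is removed and one is added.
old = {"e1": {"room": "A1"}, "e2": {"room": "B2"}}
new = {"e1": {"room": "A3"}, "e3": {"room": "C1"}}
for task in generate_tasks(old, new):
    print(task.action, task.event_id)
```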

4.3.2.2 SaaS
The connection to the SaaS cloud, Google Calendar, is done through an interface. By going
through an interface the functionality that is needed can be defined in one place and the actual
implementation of the functionality can be taken care of in a different place.

This is used in combination with the factory pattern to facilitate the creation of either the SaaS
connection or the mock database connection. The usage of the same interface ensures that both
the SaaS connection and the mock database connection have the same outward functionality.
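
The combination of a shared interface and a factory can be sketched as follows. The sketch is in Python rather than the C# of the actual system, all class and function names are hypothetical, the mock keeps its data in memory instead of in MySQL, and the SaaS class is only a placeholder that makes no real API calls.

```python
from abc import ABC, abstractmethod


class CalendarConnection(ABC):
    """Hypothetical interface shared by the SaaS and the mock connection."""

    @abstractmethod
    def create_event(self, user: str, event_id: str, data: dict) -> None: ...

    @abstractmethod
    def edit_event(self, user: str, event_id: str, data: dict) -> None: ...

    @abstractmethod
    def delete_event(self, user: str, event_id: str) -> None: ...


class MockConnection(CalendarConnection):
    """In-memory stand-in for the database-backed mock-object."""

    def __init__(self) -> None:
        self.events = {}

    def create_event(self, user, event_id, data):
        if (user, event_id) in self.events:
            raise ValueError("event already exists")
        self.events[(user, event_id)] = dict(data)

    def edit_event(self, user, event_id, data):
        if (user, event_id) not in self.events:
            raise KeyError("event does not exist")
        self.events[(user, event_id)] = dict(data)

    def delete_event(self, user, event_id):
        del self.events[(user, event_id)]


class SaasConnection(CalendarConnection):
    """Placeholder: a real implementation would call the calendar API here."""

    def create_event(self, user, event_id, data):
        raise NotImplementedError("no real API calls in this sketch")

    def edit_event(self, user, event_id, data):
        raise NotImplementedError("no real API calls in this sketch")

    def delete_event(self, user, event_id):
        raise NotImplementedError("no real API calls in this sketch")


def create_connection(kind: str) -> CalendarConnection:
    """Factory: the setting decides which implementation is instantiated."""
    connections = {"mock": MockConnection, "saas": SaasConnection}
    if kind not in connections:
        raise ValueError(f"unknown connection type: {kind}")
    return connections[kind]()
```

Because the calling code only depends on the interface, a test suite can switch between the two implementations by changing the single setting passed to the factory.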

Because of the limitations on creating calendars and the system needing to be able to manage
thousands of users it is not possible in this case to create a calendar for each class group or user.
Instead every event is added directly into the primary calendar for every participant through
impersonation. Impersonation means using a service account to gain access to a user in a domain
and edit that user's calendars. Since the only calendars that are used are the primary calendars,
the only needed types of access to the SaaS are create event, delete event and edit event.

4.3.2.3 Mock Object


The mock object for the SaaS cloud is created using a MySQL database. The database is reached
through the same interface as the connection to Google Calendar. The mock database is created
to resemble the functionality of the SaaS, to achieve the same results as when using the SaaS.
However, since mock-objects are supposed to be simpler than the real code[19], completely
identical functionality was neither the goal nor the actual outcome. Instead the focus lies on
simulating different states which are then used to test the implementation that makes use of the
mock-object.

To emulate the functionality of Google Calendar that the integration system makes use of, the
database needs to be able to store events, users and connections between the users and events.
This lets the database have states similar to that of Google Calendar, with a calendar being
represented by all of the events that are connected to the user. In Google Calendar an event is
connected to a specific calendar. However due to the fact that this integration system only makes
use of one calendar per user, the default primary calendar, the mock can be simplified by having
the users directly connected to events instead of multiple calendars without losing any form of
functionality.
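
The simplified data model described above can be sketched with three tables: users, events and a link table between them. The sketch below uses Python with an in-memory SQLite database purely so that it is self-contained, whereas the actual mock-object uses MySQL; the table and column names are assumptions and are not taken from the real schema.

```python
import sqlite3

# Hypothetical schema: a user's calendar is the set of events linked to the user.
SCHEMA = """
CREATE TABLE users (email TEXT PRIMARY KEY);
CREATE TABLE events (
    id         TEXT PRIMARY KEY,
    summary    TEXT NOT NULL,
    start_time TEXT NOT NULL,
    end_time   TEXT NOT NULL
);
CREATE TABLE user_events (
    user_email TEXT NOT NULL REFERENCES users(email),
    event_id   TEXT NOT NULL REFERENCES events(id),
    PRIMARY KEY (user_email, event_id)
);
"""


def build_mock_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_event(conn, event_id, summary, start, end, participants):
    """Store an event and connect it directly to each participant."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)", (event_id, summary, start, end)
    )
    for email in participants:
        conn.execute("INSERT OR IGNORE INTO users VALUES (?)", (email,))
        conn.execute("INSERT INTO user_events VALUES (?, ?)", (email, event_id))


def calendar_for(conn, email):
    """Return the ids of all events connected to the user, ordered by start."""
    rows = conn.execute(
        "SELECT e.id FROM events e "
        "JOIN user_events ue ON ue.event_id = e.id "
        "WHERE ue.user_email = ? ORDER BY e.start_time",
        (email,),
    )
    return [row[0] for row in rows]
```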

The states that are possible in Google Calendar are not only connected to the actual data that
resides in the cloud. There are some states that are triggered when specific sequences of requests
are done, e.g. when performing more than 500 requests in a 100 second time-frame. When this
happens it enters a lockout state in which further requests are denied for the remainder of the
duration. These states are added to the mock database together with manual trigger functions.
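
A lockout state with a manual trigger can be sketched as below. The sketch is in Python and uses a sliding window to count requests, which is an assumption: how Google Calendar counts requests internally is not known here, and the class and method names are hypothetical. The current time is passed in explicitly so that tests can control it.

```python
from collections import deque


class MockRateLimiter:
    """Denies requests after max_requests within window_seconds, or on demand."""

    def __init__(self, max_requests: int = 500, window_seconds: float = 100.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()   # times of the accepted requests
        self._locked_until = None    # end of a manually triggered lockout

    def trigger_lockout(self, now: float, duration: float) -> None:
        """Manually force the lockout state, e.g. to simulate a daily limit."""
        self._locked_until = now + duration

    def reset(self) -> None:
        """Manually clear all limit state between tests."""
        self._timestamps.clear()
        self._locked_until = None

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` is accepted."""
        if self._locked_until is not None and now < self._locked_until:
            return False
        # Forget requests that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True
```

With the manual trigger, a test can put the mock-object into the lockout state immediately instead of first sending hundreds of requests.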

To handle the data validation, like id-format checking, a C# layer between the system and the
database is added. This layer handles the validation of parameters sent to the mock-object. It
could have been implemented solely in the database, but it is easier to perform these checks in
C# than directly in the database. The logic for this validation layer is based on
the documentation of the SaaS (Google Calendar) to keep the mock-object as valid as possible.
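
As an example of such a check, the sketch below validates a client-supplied event id. It is written in Python rather than C#, and the rule it encodes (lowercase letters a-v and digits 0-9, with a length between 5 and 1024 characters) is an assumption based on our reading of the Google Calendar API documentation for event ids; it is not necessarily the exact set of checks performed by the validation layer of the mock-object.

```python
import re

# Assumed rule for event ids: base32hex characters (a-v, 0-9), 5 to 1024 long.
_EVENT_ID_PATTERN = re.compile(r"[a-v0-9]{5,1024}")


def is_valid_event_id(event_id) -> bool:
    """Return True if the id satisfies the assumed format rule."""
    return (
        isinstance(event_id, str)
        and _EVENT_ID_PATTERN.fullmatch(event_id) is not None
    )


print(is_valid_event_id("abc12"))   # True
print(is_valid_event_id("ABC12"))   # False, uppercase is not allowed
```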

The one thing that the mock-object does not support is the account authentication process.
The login parts on their own could have been handled since Google Calendar makes use of
OAuth2.0[27]. However, the permissions used by Google, with service accounts and domain
authorization[36], cannot be replicated in the mock-object, and neither can the actual accounts. A
completely fake authorization and permission system could have been added, but it would make
the mock-object significantly more complex, and would not provide much for testing of the
system.

The mock-object is tested manually by performing actions against it and the corresponding
actions against the real SaaS object. The resulting data for the mock-object is then compared to
the results gained from performing the actions against the SaaS. The same is done for "faulty
actions", that is actions where the SaaS would return error codes. Both the mock-object and the
SaaS are subjected to faulty actions and the errors returned from the SaaS are compared against
the result from the mock-object.

4.3.3 Limitations Test


The case study is based on the system described in section 4.3, which uses Google Calendar
as the SaaS cloud, and the documentation on the query limitations is unspecific in some
cases. A small test was therefore designed to gather more detailed information about the limit that
looked like it would prove to be a problem for both the SUT and the testing of the system.
The limitation in question, from table 4.1, is that at most 25 calendars can be created within a
"Short Duration", or the user goes into read-only mode for several hours. The documentation about
this limit is rather brief and does not state either what a short duration means or how many hours
the read-only mode lasts for.

The test tries to determine the "Short Duration" mentioned in the documentation. This is done by
allowing an application to create 30 calendars with a delay between each calendar created. By
varying the delay and looking at how many calendars can be created before the API
goes into read-only mode, a baseline is obtained for how long the "Short Duration" needs to
be to avoid the read-only penalty.
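
The probing procedure can be sketched as follows. The sketch is in Python and runs against a fake API object with simulated time, so that it does not depend on the real service. The fake's limit window of one hour is a made-up value for demonstration only and says nothing about the actual behaviour of Google Calendar; all names are hypothetical.

```python
class ReadOnlyError(Exception):
    """Raised by the fake API when the creation limit has been exceeded."""


class FakeCalendarApi:
    """Hypothetical stand-in that allows `limit` creations per `window` seconds."""

    def __init__(self, limit: int = 25, window: float = 3600.0):
        self.limit = limit
        self.window = window   # made-up value, not a measured result
        self._created = []

    def create_calendar(self, now: float) -> None:
        recent = [t for t in self._created if now - t < self.window]
        if len(recent) >= self.limit:
            raise ReadOnlyError("calendar creation limit exceeded")
        self._created.append(now)


def probe(api, delay: float, attempts: int = 30) -> int:
    """Create calendars `delay` seconds apart and count how many succeed."""
    now = 0.0
    created = 0
    for _ in range(attempts):
        try:
            api.create_calendar(now)
        except ReadOnlyError:
            break
        created += 1
        now += delay   # simulated time instead of actually sleeping
    return created


# With the made-up window, a short delay hits the limit and a longer one does not.
for delay in (60.0, 150.0):
    print(delay, probe(FakeCalendarApi(), delay))
```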

4.4 Testing

Testing of the system was done using unit tests, written in the same language as the system, C#,
with the Visual Studio unit test framework MSTest. A test suite is created for the system where the
connection type is used as a setting to allow the suite to be executed using either the mock-object
or the real SaaS object. The test suite contains 30 separate unit test cases which are implemented
by us.

The test cases are written to cover the desired functionality of the system, representing the
requirements of the system. Use cases taken from the requirements are used to create
tests in which the functionality is tested. An example of a use case is the addition of a new
activity in the source data; this test can be seen in appendix B.2. In this test a new activity has
been added to the source data, and the test checks that new events have been created, as well as
checking the data of the added events.

The data used for these tests is taken from the test server provided by Nova Software and stored
locally as XML files. This test data uses the same format as the real data, with the exception
that the users in the data are fake in order to adhere to the Personal Records Act[30]. This data
has then been altered slightly, so that the fake users in the data can be mapped against the test
accounts available for Google Calendar.

The test suite also contains unit tests which focus on the system's handling of incorrect requests
made to the mock-object/SaaS connection, and the resulting behaviour of the synchronization
queue. An example of this kind of test can be seen in appendix B.1. In this test the creation of
an identical event is tested, checking how the queue and logging handle this specific
case. These test cases are of the shorter and simpler type. The test cases that are created
from the system requirements are of a different type, using more advanced setup and testing bigger
or more complex scenarios.

In addition to the created test suite, there are some standalone tests that are used to test the
integration system. These tests are not used when evaluating the difference between using a
mock-object and a SaaS. These are the types of tests that can not be run in the test suite for both
the mock-object and the SaaS. One example of this kind of test is the test listed in appendix C,
which tests the functionality of the system when the Daily Limit is triggered. This test can be run
easily with the mock-object since the limit can be manually triggered and reset. If it had to be
done for the SaaS the test would take a very long time to execute as well as prohibit all further
access to the SaaS for the remainder of the day.

These standalone tests represent the types of tests that are lacking in the test suite. While these
tests still help to test the system, as well as evaluating the challenges with testing an application
using SaaS or mock-object database, they are not used when measuring the coverage or mutation
score using the two different connection types.

4.5 Test Evaluation
The test evaluation is done in two parts, measuring the coverage and evaluating fault detection
using mutations. The evaluation is performed for both of the test suites described in section 4.4
above. The coverage measurements are also taken for the combined test suites, measuring how
much the two test suites cover together. In addition to this, it was noted that the execution
times for the suite varied a lot between using the SaaS and using the mock-object, so a test to see
the difference is done by running the test suite ten times for each set-up and calculating the
average test execution time. The set-ups tested are running the tests against the SaaS, running the
tests with the mock-object on an external computer within the same city and running the tests with
the mock-object on the same computer (localhost).

4.5.1 Test Coverage


The coverage measurements are taken using third party tools. For Statement Coverage
dotCover by JetBrains[10] is used, however it does not support any other type of coverage
measurements, so another tool is needed as well. To measure Branch and Condition coverage the
tool NCover[25] is used.

When measuring the Branch and Condition coverage the process described in figure 3.1 is used.
The NCover tool automatically collects coverage during the execution of the test suites. The
coverage measurements are presented both statistically, as percentages, and as a visual
representation in the editor which allows the developers to see directly what code is covered.
Both test suites are first executed separately to generate the coverage for the individual test suites,
and then executed together to get the coverage of the combined test suites.

The Statement coverage is measured in a similar fashion to the other coverage measurements,
but using dotCover instead. Like the NCover tool, dotCover provides both the actual
measurement data and a visual representation in the form of highlighting in the editor.

4.5.2 Fault Detection


When evaluating the fault detection for the test suites, the process described in figure 3.2 is
used. This is done with the help of an external tool called VisualMutator[38]. By using the
VisualMutator tool, mutation operators are applied to the original program, injecting faults into
the source code.

The mutation operators that are used for this evaluation are all of the "Standard Operators" which
are included for the VisualMutator tool and can be seen in the list below.

• AOR - Arithmetic Operator Replacement
• SOR - Shift Operator Replacement
• LCR - Logical Connector Replacement
• LOR - Logical Operator Replacement
• ROR - Relational Operator Replacement
• OODL - Operator Deletion
• SSDL - Statement Block Deletion

For each mutant created by the tool, the test suite is then run on the mutated program, identifying
if a mutation is killed or left alive by the test suite. If all of the tests in the test suite pass,
the mutation is considered alive. If the test suite instead has at least one test which fails when
run against the mutated program, the mutation is considered to be killed.
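
To illustrate the rule, the sketch below applies a relational operator replacement (ROR) by hand to a small function and classifies the result with two different test suites. It is written in Python rather than generated by VisualMutator, and the function, the mutant and the tests are hypothetical examples that are not taken from the system under test.

```python
def is_over_limit(requests: int, limit: int) -> bool:
    """Original code: the limit is exceeded only above the limit value."""
    return requests > limit


def is_over_limit_mutant(requests: int, limit: int) -> bool:
    """ROR mutant: the operator > has been replaced by >=."""
    return requests >= limit


def classify(function, tests) -> str:
    """A mutant is alive if every test passes and killed if any test fails."""
    return "alive" if all(test(function) for test in tests) else "killed"


# A suite without a boundary test cannot tell the mutant from the original.
weak_suite = [
    lambda f: f(501, 500) is True,
    lambda f: f(499, 500) is False,
]
# Adding a test at the boundary value makes the mutant fail one test.
strong_suite = weak_suite + [lambda f: f(500, 500) is False]

print(classify(is_over_limit_mutant, weak_suite))    # alive
print(classify(is_over_limit_mutant, strong_suite))  # killed
```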

The total number of mutants generated by the tool and used for this evaluation is 281. These
mutants are evaluated using the test suite described in section 4.4, consisting of 30 unit tests.
The evaluation is done two times, once with the test suite using the mock-object and once using
the real SaaS object.

4.6 Validity Threats


Due to the fact that the tests against the SaaS are using a public cloud, the executions of the tests
can be affected by uncontrollable factors. These factors include for example unexpected backend
errors in the SaaS or unexpected downtime of the service. To alleviate these problems both the
mutation testing and the coverage measurements are taken multiple times, during different days,
to ensure that the same results are received and no unexpected factors affect the result.

The first plan for performing the mutation testing was to manually insert faults into the
code. Because the test suite was created first, the mutations would suffer from bias if they were
to be manually created and seeded by the same people who created the test suite. To get around
this problem it was instead decided to use an external tool to generate the mutants, as described
in section 4.5.2. By using a tool, the bias that would be present with manually injected faults,
and which would have presented a very large threat to the validity of the results, is avoided.

5 RESULTS

5.1 Research Question 1


When it comes to testing, what are the main differences between an application using SaaS
instead of databases for data storage?

Results summary: When it comes to testing, the three main differences are lack of
environmental control, lack of transparency and stricter usage limitations.

The list below shows the challenges of testing Future Internet Applications according to
FITTEST[39], and whether each challenge applies to systems with data in a SaaS cloud:

• Self Modification/Autonomic Behaviour - A SaaS service is not controlled by the developers
and can be changed at any time, but for the commercially available SaaS services this
should not happen frequently, and if the functionality of the service changes it affects more
than just the testing.
• Asynchronous Interactions - Whether the system is threaded or not is not affected by where
the data is stored.
• Time and Load Dependent Behaviour - Since a SaaS is not controlled by the developers,
the state it is in is not fully configurable and this could impact testing.
• Huge Feature Configuration Space - The size of the domain is not affected by where the
data is stored.
• Ultra Large Scale - The scale of the SUT is not affected by where the data is stored.
• Low Observability - This is a defining point of a SaaS cloud.

Looking at this list, it is possible to combine Self Modification/Autonomic Behaviour and Time and
Load Dependent Behaviour into a more general term, lack of environmental control, since that is
what creates those challenges.

In addition, another difference from databases can be seen by looking at the documentation of
SaaS services: there are many limitations on queries, including special limitations such
as a maximum number of calls to a specific function per day.

5.1.1 Lack of environmental control


Summary: Limited access to functionality makes it harder to clean up after tests and to
set up tests for the SaaS to return certain output.

Having to access a database through a UI or API naturally lessens the control over it to
some degree. The difference can be compared to having full control over a database versus only
having access to a set of stored procedures provided for it. The latter resembles a
SaaS system, which by design is accessed through an API and/or a UI, as described in
section 2.1.

When testing an application that only has access to the data through these APIs, the lack of control
makes the testing harder, as the developers work at a higher abstraction layer with less control.

When new data is created in a database for testing, for example by adding new events, the
database can be returned to its original state through a simple roll-back.
When working with a SaaS, like Google Calendar[12], all access must go through the API. This
means that to return to the original state after a test that adds data, each query made must be
followed by an additional query that performs the inverse of the original tested query. In the
example of Google Calendar, if the event creation functionality of a program is tested, each
event created must be removed with an additional deletion query instead of performing
a simple roll-back. A single deletion query is not a problem, but in large quantities it can be.
A deletion can also fail, which prevents the clean-up from completing; a roll-back is instead
guaranteed to succeed.
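The difference in clean-up effort can be sketched as follows. This is only an illustration: the database part uses an in-memory SQLite database, and FakeCalendarApi is a hypothetical stand-in for an API-only service, not the Google Calendar client.

```python
import sqlite3

# Database case: one roll-back undoes everything the test inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT)")
conn.commit()
conn.execute("INSERT INTO events (title) VALUES ('test event 1')")
conn.execute("INSERT INTO events (title) VALUES ('test event 2')")
conn.rollback()
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


class FakeCalendarApi:
    """Hypothetical API-only service that counts every call made to it."""

    def __init__(self):
        self.events = {}
        self.calls = 0
        self._next_id = 1

    def insert_event(self, title):
        self.calls += 1
        event_id = self._next_id
        self._next_id += 1
        self.events[event_id] = title
        return event_id

    def delete_event(self, event_id):
        self.calls += 1
        del self.events[event_id]


# API case: every insert needs its own inverse query during clean-up.
api = FakeCalendarApi()
created = [api.insert_event(f"test event {i}") for i in range(2)]
for event_id in created:
    api.delete_event(event_id)
```

In the API case the clean-up doubles the number of calls, and each deletion is one more request that can fail or count against a usage limit.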

Having access to the environment when testing also makes it possible to control the state of the
connecting system or database. This can be used when testing by manually triggering certain
states in which the system behaves differently. With manually triggered states, the
preconditions for tests can be set instantly instead of replicating the steps that trigger a certain
state. Those steps can be more time consuming and can also have longer-lasting
effects on subsequent tests if there is no quick way to revert to the original state.

5.1.2 Lack of transparency


Summary: Transparency makes defect analysis and removal easier and gives the individual
tester a more comfortable situation.

Working with a SaaS means working with a black box, i.e. there is only access to the API/UI and
the underlying functionality is hidden. Knowing no more than what to input for a query and
the expected output could in theory pose a problem for a developer. For example, in a function
with full insight it is often possible to see where in the code things go wrong if the output is not
as expected, but if data is sent into a black box and it returns an unexpected output it is much
harder to know when and where the problem occurred.

The interviewed employee said (Appendix A) that he does not see this as a problem since he
expects Google’s services to be stable and to work as intended, but that it is also uncomfortable
not to have full control.

FITTEST[39] mentions lack of transparency as a problem but does not elaborate on it.

5.1.3 Query limitations


Summary: Limitations in amount and speed of queries slow down testing and can lock
out access for other tests, clean-up and development.

When it comes to commercial cloud services, most services have set usage limitations, which
can be restrictions on virtual resources or hardware, or actual limitations on the requests that
can be made to the service. These limits vary between cloud service providers. In the
list below three big SaaS providers, Google (with focus on the calendar service), Amazon Web
Services (AWS) and Microsoft, are presented with some of the limits they put on their services.

• Google - The official limits can be seen in Table 4.1. The first two rows are limits that
apply to all Google services and the last three rows are special limits for Google Calendar.
Testing showed that these limitations are not exact. The limit on
creation of calendars was tested with varying sleep times between calendar creations. The
results of this test, shown in Table 5.1, indicate that the limit seems to be at
37 calendars rather than 25, and that it is possible to create approximately three calendars per
hour after reaching the limit. The sleep time also does not seem to affect the number of
calendars that can be created, so the short amount of time mentioned is fairly long, at
least 37 × 120 seconds = 4440 seconds = 74 minutes.

Sleep time     Calendars created     Wait time      Calendars created
10 seconds     37                    45 minutes     2
30 seconds     37                    1 hour         3
60 seconds     37                    2 hours        5/6
120 seconds    37                    3 hours        8
300 seconds    37                    6 hours        15
Table 5.1: Limitations test results - Sleep time is the time in seconds until the next calendar is
created, Calendars created is how many calendars could be created before going into
read-only mode, and Wait time is the time after reaching the limit until the next try. All
tests done after going into read-only mode used a sleep time of 10 seconds.

• Microsoft - Microsoft provides several different SaaS options, like Office 365 and OneDrive.
The OneDrive service can be accessed either through the standard user interfaces or through
an API. When accessing through the API there are throttling limitations, similar to when
using the Google SaaS APIs. Microsoft, however, does not provide any specific numbers on
what the throttling limits actually are, since they reserve the right to change the limits at
any given time[23].
• Amazon Web Services - Amazon has a long list of limitations[2] depending on which of
their services is used. Many of them are not relevant for this paper since Amazon does not only
provide SaaS. The limits are mostly defined in requests per second or maximum numbers of,
e.g., users or databases. Some limits can be increased on request.
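The kind of probe behind the Google results in Table 5.1 could be scripted roughly as below. This is a minimal sketch and not the script used in the study: FakeCalendarService, QuotaExceeded and probe_limit are hypothetical names, the quota of 37 is taken from the observations above, and the sleep function is injected so that the sketch runs without actually waiting.

```python
import time


class QuotaExceeded(Exception):
    """Hypothetical error raised when the service stops accepting writes."""


class FakeCalendarService:
    """Hypothetical service that refuses writes after `quota` creations."""

    def __init__(self, quota):
        self.quota = quota
        self.created = 0

    def create_calendar(self, name):
        if self.created >= self.quota:
            raise QuotaExceeded("read-only mode")
        self.created += 1


def probe_limit(service, sleep_seconds, sleep=time.sleep, max_attempts=1000):
    """Create calendars until the service refuses; return how many succeeded."""
    count = 0
    for i in range(max_attempts):
        try:
            service.create_calendar(f"probe-{i}")
        except QuotaExceeded:
            break
        count += 1
        sleep(sleep_seconds)
    return count


# One fresh fake service per sleep time; the no-op sleep keeps the run instant.
results = {
    seconds: probe_limit(FakeCalendarService(quota=37), seconds, sleep=lambda _: None)
    for seconds in (10, 30, 60, 120, 300)
}
```

Against a real service the default time.sleep would be used, and the probe itself would consume quota, which is one reason such experiments are costly to repeat.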

This can be compared to the use of databases, where the only limitation is the actual hardware,
which limits the processing power, and the bandwidth of the infrastructure.
For some SaaS providers this can be the case as well, for example some of the AWS services, but
in general there are more specific limitations for SaaS services and the limitations can apply to
specific queries as well. Observations at Cybercom showed that working against a SaaS cloud
makes for slow execution times for queries, which the employees think can be explained by
built-in slowdowns by the provider.

Limitations like these naturally slow down the testing process if a lot of tests need to
be run. If a query limit is reached, this may also prevent tests from finishing, block new
tests from running and at times even stop other developers from running their code.

5.2 Research Question 2
What are the challenges and experiences when unit testing against a mock-object compared to
unit testing against the actual API when testing a cloud integration application?

Results summary: A mock-object can be used to alleviate the problems of lack of
environmental control and query limitations but lack of transparency is a problem when designing
the mock-object.

Thomas and Hunt[37] made a list, described in section 2.2.1.1, with seven reasons for using
mock-objects to make testing easier. When looking at the testing of the integration system
described in this paper the following reasons are found to be true for the SUT.

• The real object has behaviour that is hard to trigger.
• The real object is slow.
• The real object has (or is) a user interface.

Out of these three reasons, only the first two have an actual impact on the testing of the system.
The third reason, "The real object has (or is) a user interface", does not affect testing for this
system since the user interface of the SaaS, or real object, is not used at all by the system.

The other four reasons, listed below, given by Thomas and Hunt are not found to be relevant for
this system.

• The real object has non-deterministic behaviour.
• The real object is difficult to set up.
• The test needs to ask the real object about how it was used.
• The real object does not exist yet.

When testing the integration system described in section 4.3, several aspects were found in which
the testing differed when using a mock-object. Some aspects are more challenging with
a mock-object, while others become easier.

Triggered States - The generation of tests for testing behaviours occurring during specific triggers,
such as limitations lockout and network errors.

As noted in section 5.1.1, not having full access to the data can make it harder to trigger certain
states, e.g. the rate limit for Google Calendar, when testing with data in a SaaS cloud.
This makes a big difference when testing the behaviour of the system that occurs when
these states are triggered. During the testing of the integration system it was found that using a
mock-object made it easier to test the behaviour that occurs if the SaaS reaches certain states.
Testing using a mock-object can make use of what was also noted in section 5.1.1 for databases,
namely manually triggering specific states. This is important when testing behaviour resulting from
states that cannot be triggered by the developer but can be triggered by errors at the
SaaS provider, like backend errors.

In certain cases the actions needed to trigger a state are also unsuitable to carry out during
testing. Consider the usage limits in Google Calendar: creating 10,000 events to reach a limit is
not feasible when working against live servers with actual test accounts. In these cases a state
cannot be triggered in a safe and workable way, which again is avoided by using a mock-object.

The time needed to test how the system handles certain output is much greater if the test involves
actually reaching the limit. To try this, a test was made to reach the rate limit for Google
Calendar, which is 500 requests in 100 seconds. Since a request to Google Calendar takes some
time to execute, the limit takes at least one minute to reach. Once reached, the limit
blocks other requests, stalling the clean-up for the tests and the subsequent tests. If a daily limit
is reached, other testing and to some extent even development is stalled until the next day.
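Manually triggering such a state in a mock-object can be sketched as follows. The class and function names are hypothetical and only illustrate the idea; they are not the mock-object or the integration system built in this study.

```python
class RateLimitExceeded(Exception):
    """Hypothetical error standing in for the provider's rate-limit response."""


class MockCalendar:
    """Hypothetical mock whose next responses can be forced to be an error."""

    def __init__(self):
        self.events = []
        self.forced_error = None

    def force_error(self, error):
        # Put the mock into (or, with None, take it out of) an error state.
        self.forced_error = error

    def insert_event(self, title):
        if self.forced_error is not None:
            raise self.forced_error
        self.events.append(title)
        return len(self.events)


def sync_event(calendar, title):
    """Hypothetical system code: defer the event instead of crashing."""
    try:
        calendar.insert_event(title)
        return "synced"
    except RateLimitExceeded:
        return "deferred"


mock = MockCalendar()
mock.force_error(RateLimitExceeded("rate limit exceeded"))
outcome_when_limited = sync_event(mock, "Lecture")

mock.force_error(None)
outcome_when_normal = sync_event(mock, "Lecture")
```

The limited state is reached without sending a single request, and it is left again with one call, so no later test is blocked.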

Data Validation - The generation of tests that check for data validity faults and the resulting
behaviour.

Mackinnon et al. say that "[t]he most difficult aspect is usually the discovery of values and
structures for parameters that are passed into the domain code"[19] when it comes to mocking.
The more complex the mocked system is, the harder it is to guarantee that the mock
behaves in the same way as the real object does for a given input.

During the implementation and testing of the SUT it could also be noted that this is in practice
the biggest disadvantage of testing with a mock-object. Testing how the mock-object handles
faulty input says nothing about how robust the system is, and testing how the SaaS handles
faulty input is not possible with a mock-object. However, testing how the system handles
the output that the mock/SaaS returns when it is sent faulty input is useful. Error handling that
works exactly like that of the SaaS may need to be very detailed and cumbersome to implement, and it
is mostly not worth the effort as long as the mock is able to return all output that is needed for
testing the system. In the case of Google Calendar, a lot of faulty input returns the error
"BadRequest". Implementing error handling in the mock-object that returns a "BadRequest" for only
a single scenario is a way of creating a simpler mock-object that is able to return the error code
but is not able to handle all input that would produce that result.
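A simplified mock of this kind could look like the sketch below. The names and the chosen faulty scenario (an event without a start time) are assumptions made for illustration; the real service rejects many more kinds of input.

```python
class BadRequest(Exception):
    """Hypothetical error type mirroring the "BadRequest" response."""


class SimpleMockCalendar:
    """Hypothetical mock that rejects only one kind of faulty input."""

    def __init__(self):
        self.events = []

    def insert_event(self, event):
        # The only faulty input recognised: an event without a start time.
        # Any other malformed event is accepted, unlike in the real service.
        if "start" not in event:
            raise BadRequest("BadRequest")
        self.events.append(event)
        return len(self.events)


def export_event(calendar, event):
    """Hypothetical system code: report a rejected event instead of crashing."""
    try:
        calendar.insert_event(event)
        return "ok"
    except BadRequest:
        return "rejected"


mock = SimpleMockCalendar()
rejected = export_event(mock, {"summary": "Lecture"})
accepted = export_event(mock, {"summary": "Lecture", "start": "2017-05-01T10:00"})
```

One scenario is enough to exercise the error-handling path of the system, which is the part under test; the validation rules of the service itself are deliberately not reproduced.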

Test Clean-up - Bringing the object back to its original state after performing tests.

In addition to problems with data validation, lack of environmental control brings one more
difficulty when testing a SaaS cloud. As said in section 5.1.1, the other problem is test clean-up.
A mock-object has a big advantage over testing using the SaaS because of the roll-back feature,
which none of the examined SaaS providers offered to the user.

Test Execution Times

During testing of the integration system a big difference in execution times can be seen depending
on which object the tests are run against. Running the test suite against the SaaS took on average
99.05 seconds. Running the test suite against the mock-object located externally within the same
city took on average 39.80 seconds, which is about 2.5 times faster than against the SaaS. Running
the tests against the mock-object located on the same machine (localhost) took on average 5.76
seconds, which is about 17.2 times faster than running against the SaaS. These results will differ
depending on which SaaS provider is used, where the mock-object is located and how it is
implemented.
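The speed-up factors follow directly from the measured averages, as the short calculation below shows. The variable names are chosen for illustration only.

```python
# Average execution times of the test suite in seconds, as reported above.
saas_seconds = 99.05
remote_mock_seconds = 39.80   # mock-object hosted externally in the same city
local_mock_seconds = 5.76     # mock-object on the same machine (localhost)

# Speed-up factor = time against the SaaS / time against the mock-object.
speedup_remote = saas_seconds / remote_mock_seconds
speedup_local = saas_seconds / local_mock_seconds

print(f"remote mock: {speedup_remote:.1f}x, local mock: {speedup_local:.1f}x")
```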

5.3 Research Question 3
RQ3.1: Given the proposed integration system, will the test coverage achieved differ between
testing using the SaaS and using the mock-object?

Results Summary: No difference in coverage was found.

Figure 5.1: Graph displaying the three different coverage measurements, for both the
SaaS connection and the mock-object connection.

As can be seen in figure 5.1 the coverage did not differ at all between testing using the SaaS and
testing using the mock-object.

RQ3.2: Given the proposed integration system, will the found defects differ between testing using
the SaaS and using the mock-object?

Results Summary: No difference in mutation score was found.

Connection Type    Mutants Alive    Mutants Eliminated    Mutation Score
SaaS               59               222                   79%
Mock-Object        59               222                   79%
Table 5.2: Table showing the results of the mutation testing described in section 4.5.2. 281 total
mutants were injected into the source code. The test suite contained 30 test cases.

The same result was found for the mutation testing, see table 5.2. Testing using the SaaS
eliminated the same mutants as testing using the mock-object.
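As a worked check of the figures in table 5.2, assuming that the mutation score is the share of eliminated mutants among all generated mutants (an assumption, although it is consistent with the reported 79%):

```latex
\[
\text{mutation score}
  = \frac{\text{mutants eliminated}}{\text{mutants eliminated} + \text{mutants alive}}
  = \frac{222}{222 + 59}
  = \frac{222}{281}
  \approx 0.79
\]
```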

6 DISCUSSION

6.1 Coverage Measurements


As can be seen in figure 5.1, the coverage measurements for the two different connection types,
mock and SaaS, achieve the same amount of coverage. This is true regardless of the type of
coverage measurement taken. With this data the conclusion can be drawn that the test suite covers
the same amount of code regardless of the connection used. The coverage for the combined
bar, in which the coverage is measured using the test suite with both connections in separate
executions, shows that the coverage is not just the same amount, but covers the same
actual code. If the covered code differed between the two connection types, the coverage for the
combined measurement would be higher.
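The combined-bar argument can be expressed with sets: the union of two sets of covered lines has the same size as each set only if the sets are identical. The sketch below uses hypothetical line identifiers and is not output from the coverage tools used in this study.

```python
def same_code_covered(covered_a, covered_b):
    """True when both runs cover exactly the same lines.

    The combined coverage (the union) equals each individual coverage
    in size only when the two sets of covered lines are identical.
    """
    combined = covered_a | covered_b
    return len(combined) == len(covered_a) == len(covered_b)


# Hypothetical covered lines, written as "file:line".
saas_run = {"Sync.cs:10", "Sync.cs:11", "Sync.cs:20"}
mock_run = {"Sync.cs:10", "Sync.cs:11", "Sync.cs:20"}
other_run = {"Sync.cs:10", "Sync.cs:11", "Sync.cs:30"}  # same amount, other code

identical = same_code_covered(saas_run, mock_run)
different = same_code_covered(saas_run, other_run)
```

The third run covers the same number of lines as the first, but the combined coverage grows to four lines, which is exactly the increase that the combined bar would reveal.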

The graph, figure 5.1, shows that the test suite executes the same code in the integration
system regardless of connection type. It does not, however, show anything other than that the code
has actually been executed when running the test suite. This is one of the biggest problems with
test coverage measurements: as stated in section 2.3.1, test coverage cannot be used to measure
how well a program is tested. That the coverage stays the same for both connection types
does not say much other than confirming that the same code is executed, which can be seen as a
goal for the mock-object connection if it is to replace the real SaaS object for
testing.

6.2 Mutation Score


When looking at how the two different connection types actually performed, the ability to detect
faults in the code is a much better measurement than test coverage since it gives actual
faults found, instead of just code that has been executed. Since the tests make sure that the data is
valid after the execution is finished, checking whether faults are detected by the test suite
is a better measurement. This is examined through the fault detection, or mutation testing,
described in section 4.5.2.

The results of the mutation testing, presented in table 5.2, show that both connection types
performed the same for the test suite. This means that the same number of faults is found by the
test suite regardless of which connection type is used. Since, as stated in section 5.3, it is the
same mutations that have been eliminated, and not just the same mutation score, it can be seen
that the test suite works as well at finding faults when using the mock-object as when using the
real SaaS object.

By having the same mutation score for the two connection types it is shown that the mock-object
works for replacing the real object for testing purposes. Combined with the results gained
from the coverage measurements, it is also shown that the exact same code is executed for both
connection types, further strengthening this conclusion. When using a mock-object for testing
there are several advantages which are presented in the following section.

6.3 Advantages
As noted in section 5.2, the main advantages of testing using a mock-object instead of the SaaS
are the alleviation of problems with lack of environmental control and query limitations as well as a

possible execution time decrease.

If it is possible to have the mock-object located on the same machine as the tests are run on,
the execution times for tests can be greatly decreased compared to using the SaaS.
Even having it at a different location can create a speed-up. For this project the mock-object
executed the tests over 2 times faster with the database in the same city and 17 times faster on
the same machine. For a system with a lot of developers wanting to access the SaaS, or with a lot
of continuous testing, a speed-up can be a big factor in favour of the mock-object.

However, this speed-up could also be a problem, as the mock-object does not perform the
same as the real object in terms of execution times. The delay introduced by the SaaS could
in theory affect certain tests. Running the test suite using the real SaaS object or with artificial
delays that mimic the delay of the SaaS can catch possible faults that are caused by this. This
could be done once in a while to ensure stability while still maintaining the advantage of the
original speed-up.
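Artificial delays of this kind could be added with a thin wrapper around the mock-object, as in the sketch below. DelayedMock and InMemoryCalendar are hypothetical names, and the delay value is an arbitrary example rather than a measured SaaS latency.

```python
import time


class InMemoryCalendar:
    """Hypothetical minimal mock-object."""

    def __init__(self):
        self.events = []

    def insert_event(self, title):
        self.events.append(title)
        return len(self.events)


class DelayedMock:
    """Wrap a mock-object and wait before every method call."""

    def __init__(self, inner, delay_seconds, sleep=time.sleep):
        self._inner = inner
        self._delay = delay_seconds
        self._sleep = sleep

    def __getattr__(self, name):
        # Only called for names not found on the wrapper itself.
        attr = getattr(self._inner, name)
        if not callable(attr):
            return attr

        def delayed(*args, **kwargs):
            self._sleep(self._delay)
            return attr(*args, **kwargs)

        return delayed


# A recording function replaces time.sleep so the example runs instantly.
recorded_delays = []
mock = DelayedMock(InMemoryCalendar(), delay_seconds=0.2, sleep=recorded_delays.append)
event_id = mock.insert_event("Lecture")
```

With the default time.sleep the same wrapper produces real delays, so the slower configuration can be run occasionally while the everyday test runs keep the fast mock-object.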

6.3.1 Lack of environmental control


Lack of environmental control is the main difference between a SaaS and a database or PaaS. For
testing it comes with problems, the most notable of which are returning specific output from the
SaaS, reaching certain states in the database and clean-up. Lack of environmental control is an area
where a mock of the SaaS can be put to good use, since full access to the data is gained.

Having full access makes it possible to send any output from the mock-object, which is handy for
testing the system’s behaviour for output that can be returned from the SaaS but is hard or
time-consuming to trigger, e.g. "BackendError" for Google Calendar, query limitations or edge
cases like int.MaxValue. Some caution is warranted, though: if output
is hard to trigger in a test it may also never occur during real execution of the system, making
a test for that case an unnecessary waste of development time.

The mock-object is also a good tool for handling the clean-up, as mentioned in section 5.2.
Clean-up is not always a necessary part of testing, but in some cases tests are designed in a way
that requires a clean or specifically set up environment, which may be hard or even impossible to
achieve in a SaaS. If such an environment is needed, a mock-object is a good way of achieving it.
Clean-up can also be important if there are no dedicated test users and cluttering the data of
real users is unwanted.

6.3.2 Query limitations


Query limitations are one characteristic that sets many SaaS services apart from databases. While
databases can also have different throttling limits, a SaaS usually presents one more level of
limitations, having both a maximum query limit and stricter limits for specific actions.

When it comes to testing, query limits like these are normally not low enough to impact
the number of tests that can be run in a smaller system like the one used. However, in a larger
system with more developers, or if the limits are very low, it could happen. If these limits are
reached when testing, queries may fail or be slowed down. As a practical
example, when using the Google Calendar API, after a certain number of requests has been made,

as can be seen in table 4.1, all requests are blocked. This lockout could last up to a day, during
which no further executions can be done, which stops testing, clean-up and development.

6.4 Disadvantages
The only real disadvantage of using a mock-object for testing is the actual development and
maintenance of the mock-object. The development is an upfront time investment, during which
the mock-object also needs to be validated to perform like the system
it impersonates. This development time is proportional to how complex the SaaS to be
mocked is. A more complex SaaS is harder and more time consuming to create a mock-object for.
Maintaining the mock-object mainly comes down to keeping the functionality of the mock the
same as that of the SaaS.

The development of the mock-object is also dependent on what information is available about the
SaaS that should be mocked. This problem is related to the lack of transparency, section 5.1.2,
in that the development of the mock-object becomes more difficult the less transparent
the SaaS is. The more extensive the documentation available for the system that
should be mocked, the easier it becomes. If the developers need to manually find out how the
SaaS handles specific edge cases and what they are, the time required to develop and validate the
mock-object increases drastically. This is also a problem in cases where the documentation
does not match reality, as described in section 5.1.3, where the SaaS limitations listed in
table 4.1 do not correspond to the limitations observed when testing the SaaS, as listed
in table 5.1.

6.5 When to use a mock-object


When deciding if a mock-object should be used for testing, all these advantages and disadvantages
should be taken into consideration, weighing the complexity of implementing a mock of the SaaS
against the utility of the extra environmental control. The initial cost is a bit higher, but
simpler and faster testing will return the value over time.

A mock-object is also very useful if it is not possible to test against the SaaS, for example if the
SaaS does not exist or can not be accessed yet.

6.6 Sustainability
The relevant sustainability aspects of this paper come from the usage of cloud computing. By
researching more efficient ways to test applications that make use of SaaS, more applications could
in theory make use of SaaS. Cloud computing makes heavy use of both resource pooling and
delivering measured services, as described in section 2.1. This means that the resources will be
shared between different customers, leading to better environmental sustainability[9, 41, 21, 20].

Looking at the economic sustainability there are two interesting factors to look at. The first
factor is connected to the one discussed for the environmental sustainability, usage of cloud
computing. By making use of cloud computing a company can limit the spending on servers to
the actual amount used, paying only for the used resources. Having servers that can support peak
activity is usually a waste of resources if it is possible to pay only for the actual overall
usage.

The other interesting factor for economic sustainability is the usage of the actual mock-object. By
using a mock-object and getting past the limitations for a SaaS, as discussed in 6.3.2, potentially
more development and testing could be performed. This can in turn lead to better economic
growth through development of additional features and increased product quality.

7 CONCLUSIONS

This paper looks at differences between testing systems that are using data in a SaaS cloud
instead of a database, how the problems with testing these systems can be remedied by using a
mock-object instead of the SaaS and finally if the test coverage and mutation score would differ
between testing with the mock-object or the SaaS.

The differences for testing between using a database and a SaaS cloud are that for SaaS clouds
there is a lack of environmental control, lack of transparency and stricter usage limitations. Lack
of environmental control makes certain output from the SaaS hard to trigger and makes clean-up
harder. Lack of transparency makes it harder to identify faults and is less comfortable for the
testers. Stricter limitations can slow down testing and can make the SaaS inaccessible for testers
and developers.

It is shown that a mock-object is very useful for problems with usage limitations and lack of
environmental control, as well as being able to provide a speed-up for test execution. The test
coverage and mutation score do not differ for this project, meaning that the mock-object is
equally good at finding faults as using the SaaS directly.

The only disadvantages with creating a mock-object are the initial cost of implementing it and
lack of transparency. Lack of transparency is a problem if the SaaS has unexpected behaviour
that makes it hard to recreate the same functionality for the mock-object. A more complex SaaS
makes it harder to recreate all functionality of the SaaS and increases the implementation cost.

In conclusion it was shown that a mock-object can find the same faults as using the real SaaS and
that the mock-object is a very powerful tool for testing when it comes to systems that are using
data in a SaaS cloud.

8 RECOMMENDATIONS AND FUTURE WORK

This study could be expanded to examine different SaaS services and testing approaches. Different
SaaS services have different properties, and seeing how testing of systems using various SaaS
services differs, or which properties are hard to recreate in a mock-object, could be valuable.
This study is done using only unit testing, so similar studies with other testing approaches
and techniques, such as system tests, regression tests or stress tests, may give other experiences.
Investigating if and how companies commercially use mock-objects instead of SaaS clouds for
testing could also be interesting, to see whether work has been done that is not available
academically.

The design of the mock-object could be examined as well. This study uses a database, but other
solutions deserve to be tried, for example having the mock-object be entirely based in memory
or write to file. For smaller systems this could be a useful alternative since the implementation
complexity should be lower.

Something that could be very useful for testing of this kind of system is a best practice guide for
the mock-object. This study shows what a powerful tool the mock-object is but more work could
be done to see how the mock-object is best created for a certain system.

REFERENCES

[1] Martijn Adolfsen. “Industrial validation of test coverage quality”. In: (2011).
[2] Amazon Web Service Limitations. 2017. url: http://docs.aws.amazon.com/
general/latest/gr/aws_service_limits.html.
[3] Michael Armbrust et al. “A view of cloud computing”. In: Communications of the ACM
53.4 (2010), p. 50. issn: 00010782. doi: 10.1145/1721654.1721672.
[4] M Armbrust et al. “Above the clouds: A Berkeley view of cloud computing”. In: University
of California, Berkeley, Tech. Rep. UCB (2009), pp. 07–013. issn: 00010782. doi:
10.1145/1721654.1721672.
[5] Calendar Limits. url: https://support.google.com/a/answer/2905486?hl=en#.
[6] Poonam Chaudhary and Seema Sangwan. “Software Testing : Affirming Software Quality”.
In: International Journal of Innovations in Engineering and Technology (IJIET) 5.3 (2015),
pp. 378–383.
[7] Marc Clifton. Mock-Object Pattern. 2004. url: https://www.codeproject.com/
Articles/5772/Advanced-Unit-Test-Part-V-Unit-Test-Patterns#Mock-Object%20Pattern25.
[8] D Cohen and B Crabtree. Qualitative Research Guidelines Project. 2006. url: http:
//www.qualres.org/HomeInte-3595.html.
[9] Konstantinos Domdouzis. “Sustainable Cloud Computing”. In: Green Information Technol-
ogy: A Sustainable Approach (2015), pp. 95–110. issn: 18670202. doi: 10.1016/B978-
0-12-801379-3.00006-1.
[10] dotCover. url: https://www.jetbrains.com/dotcover/.
[11] Fittest Web. url: http://crest.cs.ucl.ac.uk/fittest/project.html.
[12] Google. Google Calendar API. url: https://developers.google.com/google-
apps/calendar/.
[13] Atul Gupta and Pankaj Jalote. “An approach for experimentally evaluating effectiveness
and efficiency of coverage criteria for software testing”. In: International Journal on
Software Tools for Technology Transfer 10.2 (2008), pp. 145–160. issn: 14332779. doi:
10.1007/s10009-007-0059-5.
[14] Inspec. url: http://www.theiet.org/resources/inspec/index.cfm.
[15] Cem Kaner. “What Is a Good Test Case?” In: Software Testing Analysis & Review
Conference (STAR East) (2003), pp. 1–16.
[16] Manveen Kaur. “Testing in the Cloud : New Challenges”. In: (2016), pp. 742–746.
[17] Ned Kock. “Action Research”. In: The Encyclopedia of Human Computer Interaction.
2nd Editio. 2013. Chap. 33. isbn: 9788792964. url: https://www.interaction-
design.org/literature/book/the-encyclopedia-of-human-computer-interaction-2nd-ed.
[18] Neal Leavitt. “Is cloud computing really ready for prime time?” In: Computer Society
IEEE 42.1 (2009), pp. 15–25. issn: 00189162. doi: 10.1109/MC.2009.20. url:
http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4755149.

[19] Tim Mackinnon, Steve Freeman, and Philip Craig. “Endo-Testing : Unit Testing with
Mock Objects”. In: Extreme programming examined (2001), pp. 287–301. url:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.23.3214&rep=rep1&type=pdf.
[20] Alexandros Marinos and Gerard Briscoe. “Community cloud computing”. In: Lecture
Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics) 5931 LNCS (2009), pp. 472–484. issn: 03029743.
doi: 10.1007/978-3-642-10665-1_43.
[21] Dragan S. Markovic et al. “Smart power grid and cloud computing”. In: Renewable and
Sustainable Energy Reviews 24 (2013), pp. 566–577. issn: 13640321. doi:
10.1016/j.rser.2013.03.068. url: http://dx.doi.org/10.1016/j.rser.2013.03.068.
[22] Peter Mell and Timothy Grance. “The NIST definition of cloud computing”. In: NIST
Special Publication 145 (2011), p. 7. issn: 00845612. doi: 10.1136/emj.2010.096966.
url: http://www.mendeley.com/research/the-nist-definition-about-cloud-computing/.
[23] Microsoft. OneDrive Limits. url: https://social.msdn.microsoft.com/Forums/en-US/c4aaa90c-dc75-441f-9b30-f5c45e402ac4/limitation-on-api-requestfile-size-limitationresume-upload?forum=onedriveapi.
[24] Microsoft. XSD-Tool. url: https://msdn.microsoft.com/en-us/library/x6c1kb0s(v=vs.110).aspx.
[25] NCover. url: https://www.ncover.com/.
[26] Nova software. Novaschem. url: http://www.novaschem.com/.
[27] Oauth. url: https://oauth.net/2/.
[28] Benny Pasternak, Shmuel Tyszberowicz, and Amiram Yehudai. “GenUTest: A unit test and
mock aspect generation tool”. In: International Journal on Software Tools for Technology
Transfer 11.4 (2009), pp. 273–290. issn: 14332779. doi: 10.1007/s10009-009-0115-4.
[29] Ron Patton. Software Testing, Second Edition. 2nd Edition. Sams, 2005. isbn: 0-672-32798-8.
[30] PUL. url: https://lagen.nu/1998:204.
[31] Carl Rabeler et al. SQL (PaaS) Database vs. SQL Server in the cloud on VMs (IaaS) |
Microsoft Docs. 2017. url:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-paas-vs-sql-server-iaas.
[32] Leah Riungu-Kalliosaari, Ossi Taipale, and Kari Smolander. “Testing in the cloud:
Exploring the practice”. In: IEEE Software 29.2 (2012), pp. 46–51. issn: 07407459. doi:
10.1109/MS.2011.132.
[33] Per Runeson and Martin Höst. “Guidelines for conducting and reporting case study research
in software engineering”. In: Empirical Software Engineering 14.2 (2009), pp. 131–164.
issn: 13823256. doi: 10.1007/s10664-008-9102-8.
[34] Dmitry Savchenko, Nikita Ashikhmin, and Gleb Radchenko. “Testing-as-a-service approach
for cloud applications”. In: Proceedings of the 9th International Conference on
Utility and Cloud Computing - UCC ’16 (2016), pp. 428–429. doi: 10.1145/2996890.3007890.
url: http://dl.acm.org/citation.cfm?doid=2996890.3007890.
[35] Scopus. url: https://www.scopus.com/home.uri.

[36] Service Account Documentation. 2017. url:
https://developers.google.com/identity/protocols/OAuth2ServiceAccount.
[37] Dave Thomas and Andy Hunt. “Mock objects”. In: IEEE Software 19.3 (2002), pp. 22–24.
issn: 07407459. doi: 10.1109/MS.2002.1003449.
[38] VisualMutator. url: https://visualmutator.github.io/web/.
[39] Tanja E J Vos et al. “Future Internet testing with FITTEST”. In: Proceedings of the European
Conference on Software Maintenance and Reengineering, CSMR (2011), pp. 355–358.
issn: 15345351. doi: 10.1109/CSMR.2011.51.
[40] Tanja E J Vos et al. “The FITTEST Tool Suite for Testing Future Internet Applications”.
In: Future Internet Testing: First International Workshop, FITTEST 2013, Istanbul,
Turkey, November 12, 2013, Revised Selected Papers. Ed. by Tanja E J Vos, Kiran
Lakhotia, and Sebastian Bauersfeld. Cham: Springer International Publishing, 2014,
pp. 1–31. isbn: 978-3-319-07785-7. doi: 10.1007/978-3-319-07785-7_1. url:
http://dx.doi.org/10.1007/978-3-319-07785-7_1.
[41] Daniel R. Williams, Peter Thomond, and Ian Mackenzie. “The greenhouse gas abatement
potential of enterprise cloud computing”. In: Environmental Modelling and Software
56 (2014), pp. 6–12. issn: 13648152. doi: 10.1016/j.envsoft.2013.11.012. url:
http://dx.doi.org/10.1016/j.envsoft.2013.11.012.
[42] Claes Wohlin et al. Experimentation in Software Engineering. Springer International
Publishing, 2012.
[43] Lian Yu et al. “Testing as a service over cloud”. In: Proceedings - 5th IEEE International
Symposium on Service-Oriented System Engineering, SOSE 2010 (2010), pp. 181–188.
issn: 09505849. doi: 10.1109/SOSE.2010.36. arXiv: 0402594v3 [cond-mat].
[44] Yuen Tak Yu and Man Fai Lau. “A comparison of MC/DC, MUMCUT and several other
coverage criteria for logical decisions”. In: Journal of Systems and Software 79.5 (2006),
pp. 577–590. issn: 01641212. doi: 10.1016/j.jss.2005.05.030.
[45] Hong Zhu, Patrick A. V. Hall, and John H. R. May. “Software unit test coverage and
adequacy”. In: ACM Computing Surveys 29.4 (1997), pp. 366–427. issn: 03600300. doi:
10.1145/267580.267590.

A INTERVIEW DEVELOPER CYBERCOM

Have you worked on developing applications that use data in the cloud?

Yes.

What type of system have you developed?

It is an integration with a business system, Procapita, which is developed by Tieto and used by
almost all municipalities. The integration is used to sync the data. Procapita is the master, and
many programs depend on that data. It is therefore natural to use it: when something new comes in,
for example, it is entered there and then it has to be synced so that the corresponding data exists
on Google. That is really what I have done. Take user groups, for example: you get a student group
and a teacher group at each school. When I say groups I really mean mailing lists; Google Groups,
which you have surely heard of. One is also created for each class, which is very convenient when
they use Google Classroom, because the whole class can be added through one mailing list. Names and
all the surrounding personal data are of course synced as well. What I find exciting is that much
of it is a given: if there is a person called Petter in Procapita, he should be called Petter, and
so on. That is fairly obvious and not much can go wrong, but then a lot of other things come up.
Just the other day, for example, I had a student in Komvux (municipal adult education), where
studies are organised course by course rather than in classes as in upper secondary or regular
school. Their needs are a bit different. It may be someone who works full time and studies at
Komvux on the side, and then it is very important that the Google account is ready when they arrive
on the first introduction day, which may be the only time you meet the students apart from the roll
call later. So we have special rules for Komvux: those students get their accounts 30 days earlier
than all other students. There are many business rules like that, and it is very interesting how
you handle them; it is not always simple to go from a concrete problem statement to breaking it
down into requirements. Then you also want something that is easy to get an overview of and, above
all, testable, so that you can test every possible outcome. That is not always easy.

How was the system tested, then?

Well, that is the thing: unit tests are the foundation. That may not say much in itself; rather,
you try to use some form of mocking. There is almost no good tool that does this. In the first step
of the system you create an account, in the second step you change the account, in the third step
you do something else with the account, so where you are in the next step depends on which steps
have already been run, and so on. You are then in a, what should one call it, it is not stateless.
Each step builds on the earlier ones, so you need a test framework that supports that and remembers
which tests have been run earlier. The tests become very dependent on one another. That makes it
very difficult; you cannot just mock that you get, say, a 200 OK back or something like that. You
do not get the complete test you would like, but you can always write to a log file. Now we send to
this system product instead, and then you manually check that this request looks reasonable, that
the request leading up to it is reasonable, and that the sequence of requests is reasonable. That
way you avoid the most common mistakes: if you add this user 30 times, you realise that something
is wrong. You may not need to automate that in order to see it. Still, you would like to be able to
press compile, run the tests and know that everything works, but unfortunately it is not that
luxurious. There may be some way, but not one that I have come across.
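
The point above, that a mock returning a fixed 200 OK cannot cover steps that build on one another, can be illustrated with a stateful fake. The following is only a minimal sketch and not the interviewee's code: the names `FakeAccountService`, `AccountResult`, `Create`, `Rename` and `Reset` are hypothetical and do not correspond to any Google API. The fake remembers earlier calls and records the request sequence, so a test can check both the resulting state and the order of requests.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result codes, invented for this sketch.
public enum AccountResult { Ok, AlreadyExists, NotFound }

// Stateful in-memory stand-in for a remote account service (assumption:
// the real service is reached through an interface that this could replace).
public class FakeAccountService
{
    private readonly Dictionary<string, string> displayNames =
        new Dictionary<string, string>();

    // Every call is recorded so a test can inspect the request sequence.
    public List<string> RequestLog { get; } = new List<string>();

    public AccountResult Create(string id, string displayName)
    {
        RequestLog.Add("create " + id);
        if (displayNames.ContainsKey(id)) return AccountResult.AlreadyExists;
        displayNames[id] = displayName;
        return AccountResult.Ok;
    }

    public AccountResult Rename(string id, string displayName)
    {
        RequestLog.Add("rename " + id);
        if (!displayNames.ContainsKey(id)) return AccountResult.NotFound;
        displayNames[id] = displayName;
        return AccountResult.Ok;
    }

    // Returns the stored name, or null if the account does not exist.
    public string GetDisplayName(string id)
    {
        string name;
        return displayNames.TryGetValue(id, out name) ? name : null;
    }

    // Returns the fake to a clean state between tests.
    public void Reset()
    {
        displayNames.Clear();
        RequestLog.Clear();
    }
}
```

With such a fake, a test that adds the same user twice gets `AlreadyExists` on the second call, and `RequestLog` replaces the manual inspection of a log file described in the answer.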

What are the main differences, regarding testing, between an application that uses its own
database and one that uses the cloud instead?

I suspect that is the crux of it: you do not have any of this, what is it called now?

Johannes - Roll-back?

Yes, exactly, when you roll back and have a transaction. You do not really get that, and it would
have been very nice to have something like it. But it is nothing I have seen Google support at all.
I do not think Google is uniquely bad in any way either; it is something general with cloud services.

Johannes - Yes, there are cloud services where you can pay to get database control, but that is a
different level.

Yes, it is a bit vague to say cloud services. A database can be a cloud service, and so on.

Johannes - Google does not support that, in any case.

No, not as I have understood it. There is some third-party system called Google Mock, or whatever it
was called, but it is nothing we have really used. We do not know whether we can trust it, and should
we start testing the test library? From some unknown developer on GitHub and so on.

Johannes - Then you also have to be able to trust that Google's system is correct.

Yes, and it absolutely is, most of the time.

Johannes - Above all it is your own connection to it that has to be correct.

Yes, exactly.

How have you handled the problem that you send the data to Google through an API that behaves
like a black box, so that you do not know what is inside it?

Hmm, well, what can I say? It is not that much of an issue; I do not really see it as a big
problem. I think of it like writing to a file on the system, for example: there you have
abstracted away the file system. I feel the same way about the account creation itself: if I just
send a create, I do not really care that it is a black box or what happens in the background. As
long as it specifies the expected output for a given input, I do not see it as a major problem.
That said, it is always very uncomfortable, and you would like to have sudo, control bit by bit.

How have you handled asynchronous, multithreaded requests from the clients, if there are any?

Well, right now it works like this: it fills a long queue with a lot of accounts. The bottleneck
is creating everything, the network of course. So it fills a big queue, which gets huge, with very
many users, and then it sets up, whatever it is, 7 threads or so that just start firing accounts at
Google. It also collects all requests that are alike. A blocking queue is used; it is quite simple,
nothing complex. No mutexes and such, or well, there may be some inside a blocking queue at bottom,
but it solves the problem for you. The threads also only read from the same queue; nothing is
writing to it, so there is nothing that could cause race conditions and the like.
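
The queue-and-workers arrangement described above can be sketched with .NET's `BlockingCollection<T>`. This is a minimal sketch under stated assumptions, not the interviewee's implementation: the class `AccountUploader`, the method `UploadAll` and the `send` callback (standing in for the network call to the cloud service) are hypothetical names. The queue is filled completely before the workers start, so the workers only read from it, as in the description.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class AccountUploader
{
    // Fills a blocking queue with all accounts, then lets 'workerCount'
    // threads drain it. Returns the number of items passed to 'send'.
    // 'send' is a hypothetical callback standing in for the network call.
    public static int UploadAll(IEnumerable<string> accounts, int workerCount,
                                Action<string> send)
    {
        using (var queue = new BlockingCollection<string>())
        {
            foreach (var account in accounts) queue.Add(account);
            // No more items will be added; consumers stop when the queue is empty.
            queue.CompleteAdding();

            int sent = 0;
            var workers = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Each item is handed to exactly one consuming thread.
                    foreach (var account in queue.GetConsumingEnumerable())
                    {
                        send(account);
                        Interlocked.Increment(ref sent);
                    }
                });
                worker.Start();
                workers.Add(worker);
            }

            foreach (var worker in workers) worker.Join();
            return sent;
        }
    }
}
```

Because the collection does the locking internally, the calling code needs no explicit mutexes; the only shared counter is updated with `Interlocked.Increment`.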

Have there been problems where many people use the system and things occur that you cannot
reproduce?

Ah, maybe that was more what you had in mind? Several users on the same system, with threads. Yes,
we have experienced that. We have a sync loop that runs, and then there is an administration tool.
The administration tool and the sync of course use the same database; there is a private, local
database, and then there is Google, which writes to the same local database. It becomes a problem
if, say, the sync has hung so that it takes an extremely long time, and someone comes in at seven
in the morning to make a quick fix while the sync is still using the database. We have never seen
that happen, but it has happened that someone forgot to close the program in the evening. Then the
data was not saved as it should be at shutdown, and the program was still running when the sync was
due to start during the night. The way we have solved it is that if the program is already running
it cannot be started again; it is locked to one user. It is a very simple model: the program
creates a file in a folder, and if that file exists the program cannot be started, because it then
has to be shut down correctly first.
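
The lock-file model described above can be sketched as follows. This is an illustration under assumptions, not the system's actual code: the class name `SingleInstanceLock` and its members are hypothetical. The sketch relies on `FileMode.CreateNew`, which throws an `IOException` when the file already exists, so the existence check and the creation happen in one step.

```csharp
using System;
using System.IO;

// Hypothetical single-instance guard based on a lock file.
public sealed class SingleInstanceLock : IDisposable
{
    private readonly string path;
    private FileStream stream; // kept open for as long as the lock is held

    private SingleInstanceLock(string path, FileStream stream)
    {
        this.path = path;
        this.stream = stream;
    }

    // Returns a lock object, or null if the lock file already exists
    // (or cannot be created for another I/O reason).
    public static SingleInstanceLock TryAcquire(string path)
    {
        try
        {
            // CreateNew fails if the file exists, so check and create are one step.
            var fs = new FileStream(path, FileMode.CreateNew,
                                    FileAccess.Write, FileShare.None);
            return new SingleInstanceLock(path, fs);
        }
        catch (IOException)
        {
            return null;
        }
    }

    // A correct shutdown removes the file so the program can be started again.
    public void Dispose()
    {
        if (stream == null) return;
        stream.Dispose();
        stream = null;
        File.Delete(path);
    }
}
```

As in the interview, a crash or a forgotten running instance leaves the file behind, and the program cannot be started again until it has been shut down correctly or the file has been removed.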

You mentioned earlier that the accounts involve a lot of user configuration with names and
personal data. How do you handle that in testing if there are clashes and the like when you
create Google accounts?

What should be there is fairly well specified. So there is just one rule that they have to follow:
if there is an ‘ä’ or something like that, it becomes an ‘a’ instead.

Johannes - And if there are duplicates, so that several people have the same name?

Then there is just a counter that ticks up, so it becomes andersandersson2 and so on.
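
The two rules mentioned above (‘ä’ becomes ‘a’, and duplicates get a running counter) can be sketched as below. This is a hedged illustration, not the system's code: the class `UsernameGenerator` and its methods are hypothetical, and only the ‘ä’ mapping and the `andersandersson2` pattern come from the interview. The mappings for ‘å’ and ‘ö’, the lower-casing, and the dropping of other characters are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical username generator illustrating the two rules.
public class UsernameGenerator
{
    // Only the 'ä' rule is from the interview; 'å' and 'ö' are assumptions.
    private static readonly Dictionary<char, char> Replacements =
        new Dictionary<char, char>
        {
            { '\u00E4', 'a' }, // ä
            { '\u00E5', 'a' }, // å (assumed)
            { '\u00F6', 'o' }  // ö (assumed)
        };

    private readonly HashSet<string> taken = new HashSet<string>();

    // Lower-cases the name, applies the replacements and keeps only a-z and 0-9.
    public static string Normalize(string name)
    {
        var sb = new StringBuilder();
        foreach (char raw in name.ToLowerInvariant())
        {
            char c = raw;
            char mapped;
            if (Replacements.TryGetValue(c, out mapped)) c = mapped;
            if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) sb.Append(c);
        }
        return sb.ToString();
    }

    // Returns a username not handed out before by this instance.
    // The first duplicate gets the suffix 2, the next 3, and so on.
    public string Next(string firstName, string lastName)
    {
        string baseName = Normalize(firstName + lastName);
        string candidate = baseName;
        for (int n = 2; !taken.Add(candidate); n++)
        {
            candidate = baseName + n;
        }
        return candidate;
    }
}
```

Because the rule is a pure function of the name plus the set of names already taken, it can be unit tested without creating any real accounts.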

Johannes - And have you tested that in some way? Or run into any problems?

No, we have never really had any problems with it. Apart from having been in operation for about a
year and a half, it has never been tested during development.

B EXAMPLE UNIT TESTS FROM TEST SUITE

B.1 EventExists
//Excerpt from an MSTest-style test class; Last() requires System.Linq
[TestMethod]
public void EventExists()
{
    Dictionary<string, List<ITask>> tasks = new Dictionary<string, List<ITask>>();
    tasks.Add(testPersonNr, new List<ITask>());

    //Create event id based on run time
    String eventID = GenerateEventID();

    //Add two create event tasks with the same ID to the synchronizer queue
    tasks[testPersonNr].Add(new CreateEvent(){EventID = eventID});
    tasks[testPersonNr].Add(new CreateEvent(){EventID = eventID});
    calSyncer.AddTasks(tasks);

    //Get number of warnings in the log before performing tasks
    int nrLogsBefore = Logger.Instance.GetNumberOfLogMessages("warning");

    //Set up the environment so that an event with the specific ID already exists
    TaskResultCodes resCode = calSyncer.PerformFirstTask();
    Assert.AreEqual(TaskResultCodes.Success, resCode);

    //Try to create an event with the same ID that was already created and check its return code
    resCode = calSyncer.PerformFirstTask();
    Assert.AreEqual(TaskResultCodes.Fail_IDExists, resCode);

    //Check that the logs contain a new warning
    Assert.AreEqual(nrLogsBefore + 1, Logger.Instance.GetNumberOfLogMessages("warning"));

    //Check that the queue contains a task
    Assert.AreEqual(1, calSyncer.GetTasksInQueue());

    //Check that the first task is an update task
    var firstTask = calSyncer.GetFirstTask();
    Assert.AreEqual(TaskTypes.UpdateEventTask, firstTask.TaskType);

    //Compare the data of the generated UpdateEvent task to the CreateEvent task
    CheckTaskData(firstTask, tasks[testPersonNr].Last());

    //Cleanup -- Store all created event IDs that need to be removed when using SaaS
    createdEvents[testPersonNr].Add(eventID);
}

B.2 NewActivityAdded
[TestMethod]
public void NewActivityAdded()
{
    var setupSchedule = LoadSetupSchedule(setupDataPath);
    var testSchedule = LoadTestSchedule(testDataPath);

    string testTime = GetTestTime();

    //Setup start environment
    SetupEnv(setupSchedule, testTime);

    //Check that the correct events exist
    Assert.AreEqual(true, CheckEvent(ID1, EventDataInfo));

    //Compute the changes and append the test time to every event ID
    scheduleComparer.GetChanges(testSchedule, setupSchedule);
    foreach (var taskList in scheduleComparer.tasks)
    {
        for (int c = 0; c < taskList.Value.Count; c++)
        {
            taskList.Value[c].EventID = taskList.Value[c].EventID + testTime;
        }
    }

    calSyncer.AddTasks(scheduleComparer.tasks);

    //Add created tasks to cleanup when using SaaS
    AddToCleanup();

    //Perform every queued task
    while (calSyncer.GetTasksInQueue() > 0)
    {
        calSyncer.PerformFirstTask();
    }

    //Check that the correct events exist
    Assert.AreEqual(true, CheckEvent(ID1, EventDataInfo));
    Assert.AreEqual(true, CheckEvent(ID2, EventDataInfo));
    Assert.AreEqual(true, CheckEvent(ID3, EventDataInfo));
}

C EXAMPLE MOCK-OBJECT UNIT TEST

[TestMethod]
public void CreateEventDailyLimit()
{
    //Trigger daily limit in database
    mockConnection.TriggerDailyLimit(true);

    Dictionary<string, List<ITask>> tasks = new Dictionary<string, List<ITask>>();
    tasks.Add(testPersonNr, new List<ITask>());

    //Create event id based on run time
    String eventID = GenerateEventID();

    //Create task and add to calendar syncer
    tasks[testPersonNr].Add(new CreateEvent(){EventID = eventID});
    calSyncer.AddTasks(tasks);

    //Get number of warnings in the log before performing tasks
    int nrLogsBefore = Logger.Instance.GetNumberOfLogMessages("warning");

    //Perform task and check that it fails with the daily limit result code
    TaskResultCodes resCode = calSyncer.PerformFirstTask();
    Assert.AreEqual(TaskResultCodes.Fail_UsageLimits_DailyLimit, resCode);

    //Check that the queue now contains a create task
    Assert.AreEqual(1, calSyncer.GetTasksInQueue());

    //Check the first task
    var firstTask = calSyncer.GetFirstTask();
    Assert.AreEqual(TaskTypes.CreateEventTask, firstTask.TaskType);

    //Check that a warning has been added to the logger
    Assert.AreEqual(nrLogsBefore + 1, Logger.Instance.GetNumberOfLogMessages("warning"));

    //Check that backoff was activated and failed attempts set to 1
    Assert.AreEqual(1, calSyncer.GetFailedAttempts());

    //Disable daily limit after the test
    mockConnection.TriggerDailyLimit(false);
}
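
The test above relies on a mock-object that can be switched into the "daily limit reached" state on demand. A minimal sketch of what such a mock could look like is given below. It is an assumption-laden illustration, not the mock-object used in the study: the interface `ICalendarConnection` and the method `CreateEvent(string)` are hypothetical, while `TriggerDailyLimit` and the three result codes are taken from the listings in Appendices B and C (the real enumeration may contain more values).

```csharp
using System;
using System.Collections.Generic;

// Result codes named in the appendix listings; other values are omitted.
public enum TaskResultCodes { Success, Fail_IDExists, Fail_UsageLimits_DailyLimit }

// Hypothetical abstraction over the calendar service.
public interface ICalendarConnection
{
    TaskResultCodes CreateEvent(string eventId);
}

// In-memory mock whose failure state a test can trigger directly.
public class MockConnection : ICalendarConnection
{
    private readonly HashSet<string> events = new HashSet<string>();
    private bool dailyLimitReached;

    // Switches the simulated daily usage limit on or off.
    public void TriggerDailyLimit(bool enabled)
    {
        dailyLimitReached = enabled;
    }

    public TaskResultCodes CreateEvent(string eventId)
    {
        // While the limit is active every request is rejected and nothing is stored.
        if (dailyLimitReached) return TaskResultCodes.Fail_UsageLimits_DailyLimit;

        // HashSet.Add returns false when the ID is already present.
        return events.Add(eventId)
            ? TaskResultCodes.Success
            : TaskResultCodes.Fail_IDExists;
    }
}
```

Against the real service the daily limit can only be reached by actually exhausting the quota; with a mock of this kind the state is a single method call, and no cleanup of created events is needed afterwards.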

Blekinge Institute of Technology, Campus Gräsvik, 371 79 Karlskrona, Sweden
