Cloud does not equal the Internet
November 13th, 2018 by Heather Maloney
People frequently use the term “cloud” or “in the cloud” to simply mean located on the internet or their private intranet. I’ve done it myself, for the sake of expedience. However, they aren’t the same thing, and it’s important to understand why so that you can wisely choose the internet services you are accessing for your business or in your personal life. For example, cloud hosting is not the same as shared web hosting, for the hosting of your organisation’s website.
The internet is the global network of computers connected to one another using the TCP/IP protocol suite. The World Wide Web is one of the services that runs on top of it, sharing information using the HTTP protocol.
The term “Cloud” or “Cloud Computing” refers to technology services, usually delivered over the internet, which are characterised by:
- Distribution of a system (program and its data) across many servers and locations, to provide for greater performance, but still providing up to date and correct data.
- Automatic provisioning (addition of greater capacity via more CPUs, memory and disk space) to meet minute-by-minute requirements.
Applications embodying cloud computing are often further labelled as SaaS (software as a service), PaaS (platform as a service), IaaS (infrastructure as a service), and other ‘…aaS’ names. These labels draw attention to which part of the abstraction of the technology is controlled by the buyer compared to the service provider. However, not all applications given these labels actually provide the two main characteristics that I am asserting differentiate cloud computing: distribution across many servers, and automatic provisioning. Instead, software delivered as a service via logging into a web application may in fact be stored on one server, in one location, with one database, and require the service provider to manually procure and set up new servers when usage demands the additional resources.
The characteristics of cloud technology provide advantages and disadvantages which I will discuss in a moment, but let’s first consider the technological challenge cloud technology is trying to solve.
As you can imagine, there’s an awful lot of information contained within Facebook. Millions of users each adding several posts, and making hundreds of comments, on a daily basis, adds up very, very quickly. Not only is there a lot of data being stored and accessed by users of Facebook, but people are posting and reading comments from all around the globe: some on their phones while riding on a train, others sitting at a desktop computer in the back of beyond, and everything in between. No one will use Facebook if it takes more than a few seconds for the content to appear on their screen. Facebook is just one example of an application which handles vast amounts of data and serves vast numbers of people.
To make Facebook possible, as well as other applications like it, the underlying technology has to be distributed across multiple servers and locations – a distributed system. There are numerous technical models used to achieve a distributed system. Below are brief descriptions of just a few of the techniques to give you a feel for the complexities involved.
Techniques
Sharding. Sharding allows a single database to be stored across multiple servers by allocating logical portions of the data to different servers. A very rudimentary example would be choosing the server based on a range of identifiers: in the case of user accounts, the decision could be made to store all data with a user ID between 1 and 100,000 on server 1, between 100,001 and 200,000 on server 2, and so on. The application retrieving the data would send the query to the application server, and the database layer would then work out which server to get the data from based on the user ID, retrieve that data and return it to the user. There are many options for the way that a database may be divided; the right way for a particular application will need to consider the way current data is spread across its attributes as well as how future data may grow.
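The range-based example above can be sketched in a few lines of Python. This is only an illustration of the idea: the server names and the `shard_for_user` function are made up for this article, and a real system would also need to handle adding servers and rebalancing data.

```python
# Hypothetical range-based shard routing; names are illustrative only.
SHARD_SIZE = 100_000  # user IDs per server, as in the example above

# Assumed server names, one per range of user IDs.
SHARD_SERVERS = ["db-server-1", "db-server-2", "db-server-3"]


def shard_for_user(user_id: int) -> str:
    """Return the server that holds the data for the given user ID."""
    if user_id < 1:
        raise ValueError("user IDs start at 1")
    # IDs 1..100,000 map to index 0, 100,001..200,000 to index 1, and so on.
    index = (user_id - 1) // SHARD_SIZE
    if index >= len(SHARD_SERVERS):
        raise LookupError(f"no shard configured for user ID {user_id}")
    return SHARD_SERVERS[index]
```

Calling `shard_for_user(100_001)` would pick the second server in the list. Ranges are easy to reason about, but if most activity comes from recently created accounts, the newest server ends up doing most of the work, which is one reason other ways of dividing the data are often chosen.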
NoSQL. The example given above for sharding described separation of data contained within a relational database, the most common database architecture up until very recently. As the name suggests, a relational database relates tables of information to one another by linking IDs: a person’s record may contain the ID of a row in another table storing the details of the school they attend (name, address, phone number etc). Document databases, one common kind of NoSQL database, have become more popular recently as they can be spread more easily across multiple servers, because all the data associated with a person would be stored in one document rather than spread amongst related tables. Document databases often come with functionality built into them to manage distributing documents in the collection across multiple servers.
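To illustrate the difference, here is a small Python sketch that models both layouts with plain dictionaries. The sample person, school and function names are invented for this example, and no real database is involved.

```python
# Relational layout: two "tables" linked by an ID (hypothetical sample data).
schools = {10: {"name": "Example High School", "phone": "555-0100"}}
people = {1: {"name": "Alex", "school_id": 10}}


def person_relational(person_id: int) -> dict:
    """Assemble a person by following the school_id link (a join)."""
    person = people[person_id]
    school = schools[person["school_id"]]
    return {"name": person["name"], "school": school}


# Document layout: everything about the person lives in one document.
person_documents = {
    1: {
        "name": "Alex",
        "school": {"name": "Example High School", "phone": "555-0100"},
    }
}


def person_document(person_id: int) -> dict:
    """A single lookup returns the whole record."""
    return person_documents[person_id]
```

Both functions return the same information, but the relational version has to read from two tables, which may live on different servers once the data is sharded. The document version needs only one lookup, at the price of repeating the school details in every person who attends it.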
Caching. Caching refers to storing a copy of data close to users, or in faster storage, to provide quicker access, particularly for commonly used information. Facebook makes heavy use of memcache to store recently accessed Facebook information in memory, which is much faster to read than from the Facebook MySQL database which is housed on hundreds of thousands of servers. Content Delivery Networks (‘CDN’) are an example of caching of web content to ensure it is closer to your website visitor.
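A very simplified sketch of the caching idea is shown below, in Python. It is not how Facebook or memcache is actually implemented: the in-memory dictionary stands in for a cache server, `read_from_database` is a made-up placeholder for a slow database query, and the 60 second lifetime is an arbitrary assumption.

```python
import time

CACHE_TTL_SECONDS = 60.0  # assumed lifetime of a cached entry
_cache = {}  # key -> (expiry time, value); stands in for a cache server
stats = {"database_reads": 0}  # counts how often the slow path is taken


def read_from_database(key: str) -> str:
    """Placeholder for a slow database query (hypothetical)."""
    stats["database_reads"] += 1
    return f"profile data for {key}"


def get(key: str) -> str:
    """Return the cached value if still fresh, otherwise load and cache it."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]  # cache hit: no database access needed
    value = read_from_database(key)  # cache miss: take the slow path
    _cache[key] = (now + CACHE_TTL_SECONDS, value)
    return value
```

Asking for the same key twice in quick succession reads the database only once; the second request is answered from memory. The expiry time is the trade-off: a longer lifetime means fewer database reads, but a greater chance of showing data that is slightly out of date.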
Other concepts such as virtualization, utility computing, and grid computing are also key in the implementation of cloud computing particularly with regard to auto-provisioning of additional computing resources.
Advantages
We have touched on some of the advantages of cloud computing in relation to the problems it is trying to solve. The advantages can be summarised as:
- Security. A cloud provider must be focused on security in order to have success over the long term, and usually has significant resources at the ready to keep security up to date and respond quickly when a new threat arises. Look for:
  - End-to-end encryption which ensures the encryption of all data in-transit across the Internet and stored at-rest in the cloud, with the encryption keys held by you and used to encrypt the data before it leaves your computer.
  - Sophisticated access controls allowing you to set role-based permissions to control exactly what data each user can and cannot view, edit or share.
- Performance. Because there is likely a server near the user, rather than the user’s request needing to travel halfway around the world and back, you can expect the speed of cloud systems to be significantly better. Performance is a key factor for organisations with a workforce distributed around the globe.
- Scale. The ability to distribute an application and/or its data across multiple servers and locations removes or significantly reduces the constraints on how large an application can grow or how many customers it can efficiently serve.
- Cost. Another key benefit of cloud is that concerns such as installation of software and purchase of licenses, management of software patches, backups, hardware upgrades and repairs, anti-virus and protection against malicious attacks are usually handled by the provider of cloud computing rather than the organisation requiring the technology. When comparing the cost of cloud and non-cloud you must take into consideration the total cost of ownership of the alternatives. Auto-scaling (also referred to as elastic computing) is a factor in both cost and performance, as it allows systems to scale up (additional costs) when demand increases, and scale back (reduce costs) when demand is low, allowing the owner of the system to only pay for resources when they are required.
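To make the auto-scaling cost point concrete, here is a rough Python sketch comparing a fixed set of servers sized for the busiest hour with a set that follows demand hour by hour. Every number in it is invented for illustration: the hourly rate is not a real price, and the demand pattern is a made-up day.

```python
HOURLY_RATE_CENTS = 10  # assumed price per server per hour (not a real price)

# Hypothetical demand: servers needed in each hour of one day.
servers_needed = [2] * 8 + [10] * 4 + [4] * 12  # 24 hourly values


def fixed_cost(demand: list, rate_cents: int) -> int:
    """Cost when enough servers for the peak hour run all day."""
    return max(demand) * len(demand) * rate_cents


def autoscaled_cost(demand: list, rate_cents: int) -> int:
    """Cost when the number of servers follows demand each hour."""
    return sum(demand) * rate_cents
```

With these made-up figures, `fixed_cost(servers_needed, HOURLY_RATE_CENTS)` comes to 2400 cents for the day and `autoscaled_cost(servers_needed, HOURLY_RATE_CENTS)` to 1040 cents. If demand were flat all day, the two results would be identical, so the saving depends entirely on how much your usage rises and falls.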
Disadvantages
It is important to also be aware of the potentially significant disadvantages of cloud computing:
- Data ownership / sovereignty. Where is your data really? Who has access to it? Have you read the terms and conditions with respect to the ownership of the data? Can you remove your data permanently, or will it still be accessible by the cloud provider even after your account is closed? Often the owner of the data you place into a cloud computing solution is actually the cloud provider, not you. To help mitigate this issue, some cloud providers are implementing servers in additional countries including Australia, to help organisations to use cloud services without moving their data overseas, but you need to check where your data is stored; often such storage choice will increase the cost of the solution. NB: even if your data starts out being stored in Australia, if the data is owned by a US company, they may be forced to move the data back to the US for scrutiny by American law enforcement agencies – this has already happened in the case of Google in February 2017.
- Privacy. Facebook has been criticised at the highest levels of American government, and by governments around the world, for the way in which the data it gathers (albeit via their free service) has been used and sold on to 3rd parties. The situation with Facebook and other cloud solutions has been a factor in leading to the new European privacy legislation (GDPR). When you utilise cloud platforms, are you comfortable with the manner in which they use the data that you are storing within them (read their terms and conditions)? Can you trust the organisation to abide by their promises?
- Control. Can you create the functionality you need to support your particular processes, or are you now constrained by the services provided by the cloud platform? Using a cloud service to remove the need to create that service constrains you to the functionality the service offers. The more you depend on a 3rd party service, the less likely you are to be able to innovate in that area of your business or application, which may well slow your organisation down and remove your opportunity to create competitive advantage.
- Cost. Whilst being able to pay per second for your application using cloud technologies may sound like it is going to reduce your cost, if your application isn’t built to take advantage of cloud technologies, the opposite may occur and your costs can be significantly more than using simpler internet technologies. Cost can also be significantly greater if you use the wrong technology on the wrong cloud provider. For example, whilst the major suppliers of cloud technology usually allow you to run any type of application on their cloud servers, the cost of running those different types may be very different. Running a MS SQL database on Google Cloud is extremely expensive, for example, compared to running it in the Microsoft Azure platform. You need to choose your technology wisely.
- Skills. Not everyone developing applications is experienced in working on large scale applications, and the implementation of applications using cloud technologies is relatively new, so finding personnel with the required skills can be very challenging.
Whilst I have primarily been discussing cloud computing from the point of view of building an application such as Facebook, cloud computing underpins solutions such as Office 365, Dropbox and G Suite. These applications allow users all over the world, sometimes the same person in different parts of the world on the same day, to access their data – emails and files for example – and programs such as G Suite and Word Online, with great performance, and without the data being [noticeably] out of date, most of the time. Such applications are also increasingly providing users with the capability to collaborate on files e.g. contributing to an online document simultaneously, again while located in different cities and countries.
For such commodity type applications, where easy access from anywhere, across multiple devices, makes business much easier, the decision to sign up for cloud computing may feel like a no-brainer. But you still need to consider the disadvantages discussed above.
In summary, not all internet applications are using cloud computing technologies. Cloud computing is a complex area, utilising multiple strategies aimed at providing up to date information, to large numbers of users all around the world, with great speed. It is important that you weigh up the advantages and disadvantages of cloud computing for both your commodity technology needs (email, file sharing, file storage, and other operational systems) as well as when developing your own applications.
If you would like to read more:
https://enterprisersproject.com/article/2017/1/three-things-companies-must-know-about-data-sovereignty-when-moving-cloud
Use of Memcache by Facebook: https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf