Monday, June 27, 2011

CloudWatch Custom Metrics for better decision making...

AWS CloudWatch is mostly used to monitor AWS resources. The most widely monitored values are system metrics such as CPU utilization, memory consumption, and disk and network I/O. Based on this information, we can decide to change the environment to cater to demand fluctuations. Although these metrics have traditionally been the primary source for predicting load behavior or system performance, today there are business parameters that can forecast demand better.

For example, the number of active user sessions, user experience or response time, and the number of transactions can be equally important for analyzing performance and deciding whether to scale resources up or down.

Even external business factors, such as an increase in rainfall, a sudden drop in temperature or a sudden change in a match result, can predict user demand. Although these parameters are computed outside the AWS systems, they can be fed to CloudWatch to forecast demand, scale resources up and down, and ensure better performance for users.

This can be achieved by using the new custom metrics feature in CloudWatch.

AWS resource, application and business related data can be fed to CloudWatch, and these metrics can then be used by Auto Scaling to decide whether to scale up or down. The data can also be used to drive monitoring graphs or to trigger notifications.

The recent release of the CloudWatch APIs supports this feature, and it is available in the .NET, Java and PHP SDKs.
Here is a blog post from AWS with more information: http://aws.typepad.com/aws/2011/05/amazon-cloudwatch-user-defined-metrics.html
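
As a rough illustration of feeding a business metric to CloudWatch, the sketch below builds and publishes an active-session count as a custom metric. It assumes Python with the boto3 SDK, which is not one of the SDKs named above, and the namespace "MyApp/Business", the metric name "ActiveUserSessions" and the "Environment" dimension are made-up names for illustration only.

```python
from datetime import datetime, timezone


def build_metric_datum(name, value, unit="Count", dimensions=None):
    """Build one entry for the MetricData list of a PutMetricData request."""
    datum = {
        "MetricName": name,
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
    }
    if dimensions:
        # CloudWatch expects dimensions as a list of Name/Value pairs.
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ]
    return datum


def publish_metric(client, namespace, datum):
    """Send one datum through a CloudWatch client object."""
    return client.put_metric_data(Namespace=namespace, MetricData=[datum])


# Example usage (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   cloudwatch = boto3.client("cloudwatch")
#   datum = build_metric_datum("ActiveUserSessions", 1250,
#                              dimensions={"Environment": "production"})
#   publish_metric(cloudwatch, "MyApp/Business", datum)
```

Keeping the payload builder separate from the call that sends it makes it easy to check the metric data locally before anything is sent to AWS.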

Amazon AWS ecosystem

As more and more organizations try to adopt cloud solutions or host solutions on the cloud, the ecosystem around cloud providers is emerging strongly. There are software and service providers who are constantly innovating to make cloud adoption smoother. That is what I thought I would showcase in this blog, to understand what tools and technologies are available around the major cloud providers. Undoubtedly, Amazon AWS is one of the major IaaS providers that organizations are looking at today for hosting solutions. As Amazon AWS has been around for a long time, the ecosystem around it is also well established.

I will approach this from the perspective of hosting a typical application on Amazon AWS. For an application to be hosted on the cloud, there are a few concerns that need to be addressed. These concerns can be termed the key requirements for managing the lifecycle of an application: Availability, Scalability, Deployment & Configuration Management, and System Monitoring & Management.

Availability - Amazon AWS provides multiple availability zones inside each region. Application components can be deployed across multiple AZs so that the application survives when disaster strikes.

Scalability - Amazon AWS provides auto scaling to provision and deprovision resources as demand rises and falls. AWS also provides load balancing to distribute the load across resources for better performance.

Deployment & Configuration Management - Deploying software platforms and applications, configuring them, and managing multiple versions of a deployment are always challenging. It is necessary to create an environment that is repeatably deployable and easily configurable.
 
The diagram above depicts a typical Amazon AWS environment along with ecosystem tools.

There are tools that help create templates describing an environment. A template can define, for example, the number of servers, the amount of storage, whether a load balancer is needed and which availability zones to use. When a template is deployed, the tool automatically provisions the resources and creates the environment. Software like Chef or Puppet can then be used to install software platforms and configure them as needed.
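
To make the template idea concrete, here is a minimal sketch in Python of an environment description and of how its servers might be spread across availability zones. The EnvironmentTemplate class, its fields and the zone names are hypothetical and do not follow the format of any particular tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EnvironmentTemplate:
    """Hypothetical description of an environment to provision."""
    name: str
    server_count: int
    storage_gb_per_server: int
    load_balancer: bool = True
    availability_zones: List[str] = field(default_factory=lambda: ["us-east-1a"])


def plan_placement(template: EnvironmentTemplate) -> Dict[str, int]:
    """Spread the servers across the zones round-robin; returns servers per zone."""
    zones = template.availability_zones
    plan = {zone: 0 for zone in zones}
    for i in range(template.server_count):
        plan[zones[i % len(zones)]] += 1
    return plan


web_tier = EnvironmentTemplate(
    name="web-tier",
    server_count=5,
    storage_gb_per_server=100,
    availability_zones=["us-east-1a", "us-east-1b"],
)
print(plan_placement(web_tier))  # {'us-east-1a': 3, 'us-east-1b': 2}
```

Because the environment is plain data, the same template can be deployed repeatedly and versioned alongside the application.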

Some of these tools are
System Monitoring & Management - Cloud resources like servers, virtual machines and storage should be monitored along with application components to ensure high availability. There are tools that provide plugins to monitor such objects, like Zenoss (www.zenoss.com) and Nimsoft (www.nimsoft.com). These tools can be used to monitor hybrid environments that have both cloud and non-cloud resources. CloudWatch monitoring from Amazon AWS is available to monitor AWS resources specifically.

Wednesday, September 15, 2010

Cloud This Week - Sept 18

Here are brief headlines on what happened this week in the cloud.

Amazon AWS

1. Amazon AWS has released micro instances with 613 MB of memory that support both 32-bit and 64-bit platforms. They cost $0.02 per hour for Linux and $0.03 per hour for Windows. Micro instances can be used for applications, or components of applications, that need fewer resources.

http://aws.amazon.com/about-aws/whats-new/2010/09/09/announcing-micro-instances-for-amazon-ec2/

2. AWS Console support for Virtual Private Cloud. VPC integrates the Amazon cloud with in-house infrastructure, so in-house security features can be extended to the VPC. The instances in Amazon are isolated, are not directly accessible from the internet, and can be accessed only through company firewalls. Subnets can even be created in the Amazon cloud, and internal IP addresses can be assigned to these instances.

The new console from Amazon helps configure the VPC, making it more user friendly and manageable.

http://aws.amazon.com/vpc/

3. Amazon released its own version of Linux on AWS. The image contains the bare minimum of services and applications to start with, and Amazon will keep upgrading it and providing patches regularly. Amazon says it is optimized for the EC2 environment and comes with a basic set of tools for working in AWS.

Here is a link to the Amazon Linux Guide.

Google App Engine

1. Google App Engine now supports multi-tenant applications using the Namespaces API. The same application can now be used by multiple tenants, with their data segregated and isolated from each other. This is done by using a different namespace for each client organization, and data operations should be performed within that namespace. The data viewer in the App Engine management console also has a namespace field for viewing each client's data separately.

2. The limit of 1000 entities per query has been removed. A query now runs until it reaches the end of the result set or the query timeout limit.
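
The namespace idea in item 1 can be illustrated with a toy in-memory store in Python. This is only a sketch of the concept of keeping each tenant's data under its own namespace; it is not the App Engine API, and the NamespacedStore class and the tenant names are invented for the example.

```python
class NamespacedStore:
    """Toy in-memory store that keeps each tenant's data under its own namespace."""

    def __init__(self):
        self._data = {}  # maps (namespace, key) -> value

    def put(self, namespace, key, value):
        self._data[(namespace, key)] = value

    def get(self, namespace, key, default=None):
        return self._data.get((namespace, key), default)

    def keys(self, namespace):
        """List the keys of one tenant only, like a namespace filter in a data viewer."""
        return sorted(k for ns, k in self._data if ns == namespace)


store = NamespacedStore()
store.put("acme", "plan", "gold")
store.put("globex", "plan", "free")
print(store.get("acme", "plan"))  # gold
print(store.keys("globex"))       # ['plan']
```

The same key ("plan") holds different values for the two tenants, and neither tenant can see the other's entry.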

Private Cloud

1. VMware and HP have come together for an end-to-end private cloud offering. The highlights of this alliance are
  • HP Insight Control is integrated with VMware vCenter to provide a single management infrastructure for both physical and virtual infrastructure.
  • The storage attached to and inside HP ProLiant G6 servers can be converted into a virtual pool of storage and made available to virtual machines. This removes the requirement for external storage such as a SAN, and helps implement high availability using VMware vCenter without the need for external storage.

Sunday, August 22, 2010

Why large organizations may not find public cloud so attractive?

Large organizations may not find the public cloud so attractive for the reasons below.

1. Disruption: Some applications cannot be migrated as is. They may need to be modified to fit a particular public cloud technology or architecture. This needs investment and may disrupt ongoing business.

2. Vendor Lock-in: As few standards are available in the cloud, most public cloud implementations are proprietary. This means vendor lock-in once you develop for a particular public cloud. Migrating from one cloud to another will be challenging once applications are locked in to a particular cloud technology and architecture.

3. SLA: The service levels offered in terms of availability, security or performance may not be acceptable to some organizations. Organizations that require higher service levels will find the public cloud inappropriate in spite of the cost advantage promised by public cloud providers.

4. Security or Compliance Constraints: There may be security or regulatory compliance issues that bar applications and data from being moved onto a public cloud. Some regulations may require data and applications to be restricted to specific premises, regions or geographies.

5. Integration Challenges: Most organizations have many applications running on their premises that are interdependent or tightly coupled. Moving some of these applications to a public cloud may pose serious integration challenges from a security or technology perspective. As most organizations have strict firewall and data communication policies, integrating applications that reside on either side of a firewall may be impossible.

The above issues are major challenges when considering a move to the public cloud, especially for medium and large organizations. Nevertheless, there can be specific business cases where large organizations find the public cloud very appropriate.

Large organizations may opt for a private or hybrid cloud, as they have the money to invest in creating this model.

Wednesday, July 14, 2010

Solutions on Cloud 4: High Performance Computing

I am writing about this solution because of the announcement of Amazon's Platform for High Performance Computation this week. 

The Hadoop framework has been popular for some time for running complex and time-consuming jobs, such as indexing large numbers of files or large amounts of data (millions or billions of files), data mining, and running analytics for financial or bioinformatics analysis. Here is a list of real life use cases for Hadoop. Hadoop uses the MapReduce algorithm to divide the work among hundreds or thousands of systems and then combine the results to produce the final output.

But the practical problem in running Hadoop is provisioning a large number of servers. It may not be possible for most organizations to allocate so many servers for a limited amount of time to run these programs. Amazon solves the problem by providing a platform for running Hadoop, called Amazon Elastic MapReduce. It abstracts the configuration and administration of servers and the Hadoop framework so that developers can focus on developing MapReduce programs to solve their problem. Amazon lets them provision as many servers as they want for as long as they require.
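
The divide-and-combine pattern described above can be sketched in a few lines of Python. This is a single-process word count meant only to show the map, shuffle and reduce steps; it does not use the Hadoop or Elastic MapReduce APIs, and the function names are chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a final count."""
    return {key: sum(values) for key, values in groups.items()}


documents = ["the cloud scales", "the cloud bursts"]
# On a real cluster each document (or file split) would be mapped on a different node.
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = reduce_phase(shuffle(mapped))
print(counts)  # {'the': 2, 'cloud': 2, 'scales': 1, 'bursts': 1}
```

Because each call to map_phase depends only on its own document, the map step can be spread over as many servers as are available, which is where on-demand provisioning helps.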


Sunday, July 11, 2010

Solutions on Cloud 3: Disaster Recovery

Building a DR site for your critical applications and data can be an expensive affair, especially if you are a small or medium sized organization. It calls for investment in space, energy, hardware, personnel etc.

Today's cloud vendors provide an infrastructure that can be leveraged to build a DR solution at a very nominal cost. Building such solutions is definitely within the reach of SMEs, and the benefits outweigh the cost.

There are three options for building a DR solution.

1. Deploy the primary solution on your own data center and have the backup components running in the cloud. This deployment architecture can be used for cloudbursting.

2. Deploy both the primary and backup components in the same cloud infrastructure. For this, the cloud vendor needs to have multiple data centers that are isolated from each other.

3. Deploy the primary components on one cloud vendor and the secondary components on another cloud vendor.

Let's look at some cloud vendors' infrastructure and how it can help build a DR solution.


Amazon AWS has data centers in four regions (US East, US West, Europe, Asia-Pacific). Each region has multiple availability zones. Each availability zone is an independent data center with its own servers and power supply, completely isolated from failures in any other data center.

But the availability zones are interconnected with high bandwidth networks.



So, while the primary components of your solution run in one availability zone, the backup components can run in either active or passive mode in other availability zones. The AWS load balancers can route traffic to multiple availability zones in the same region, thereby actively using the components in all availability zones. But the solution should have enough intelligence built into it to initiate takeover when a component fails.
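
A very small sketch of that takeover logic, in Python, might look like the following. It assumes a priority-ordered list of endpoints and a health check supplied by the caller; the host names are placeholders, and a real check would probe the application (for example over HTTP) rather than read from a dictionary.

```python
from typing import Callable, Optional, Sequence


def choose_active(
    endpoints: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None if all are down."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None


# Hypothetical endpoints: primary in one availability zone, standby in another.
endpoints = ["app.zone-a.example.com", "app.zone-b.example.com"]
health = {"app.zone-a.example.com": False, "app.zone-b.example.com": True}

active = choose_active(endpoints, lambda host: health[host])
print(active)  # app.zone-b.example.com
```

Running a check like this on a schedule, and redirecting traffic whenever the chosen endpoint changes, is one simple way to move from the primary to the backup components.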

Amazon AWS S3 provides a highly durable storage infrastructure by redundantly storing data in multiple availability zones. S3 data can also be copied over to other regions. Amazon AWS EBS snapshots can be taken and stored in S3 for later retrieval.

GoGrid's multiple-datacenter feature also provides multiple data centers, one in US East and one in US West, which can be used to deploy a DR solution.




Rackspace does not have multiple data centers yet, but you can use it for deploying your DR site while running the primary components on your own data center. Rackspace's DR Service provides infrastructure and value added services like professional services for building a DR solution for your organization. These DR solutions can be customized to the organization's needs.

Similarly, DR solution approach 1 or 3 (as specified above) can be built using any of the cloud vendors.

One of the most important factors to consider is the SLA contract with the cloud vendor. It provides insight into what services the cloud vendor can or cannot provide and whether they fit your DR solution requirements.

Please post your comments and feedback.

Saturday, July 10, 2010

Solutions on Cloud 2: Storage

One of the solutions that organizations can take advantage of in the cloud is storage. Opting for storage solutions in the cloud means cost savings in hardware, personnel and physical storage space. Managing storage devices, monitoring for storage capacity planning and managing tape libraries are expensive tasks for small and medium sized businesses. These services are available as Storage as a Service from multiple cloud vendors.


Storage that is available as a service can be any of the following types

1. Disk based storage - Raw data storage. Cloud vendors provide features like high availability, automatic failover, regular backups and scaling on demand. These features make storage as a service very attractive. Small and medium sized organizations can get these features for a nominal charge, where they would otherwise have spent hundreds of thousands of dollars.

2. Databases - Databases on demand. Features like running databases in clustered or master-slave mode, regular backups and automatic recovery from failures make these services appealing.

3. File Storage - Files can be uploaded and shared with others. The ACL for files can be set for different levels of access for users.

4. Building content delivery networks - Delivering content faster to users spread across geographies needs a lot of investment in building a content delivery network. But if CDNs are available at a fraction of this cost, they become irresistible. Now any website can use these CDNs to improve performance and reach its users globally faster.

5. Modern distributed databases (NoSQL databases) - These schema-less databases are gaining popularity for building solutions that are horizontally scalable.

Here are some of the vendors that are providing these services









Vendors      Disk based Storage  Distributed Databases  RDBMS        File Storage       CDN
-----------  ------------------  ---------------------  -----------  -----------------  ----------
Google       -                   Datastore (BigTable)   -            Google Docs        -
Amazon AWS   EBS                 SimpleDB               RDS (mySQL)  S3                 CloudFront
Permabit     Cloud Storage       -                      -            -                  -
Rackspace    -                   -                      -            Cloud Files        Limelight
GoGrid       Cloud Storage       -                      -            -                  -
Box.net      -                   -                      -            Cloud File Server  -
Microsoft    -                   -                      SQL Azure    Azure Storage      -
Nirvanix     hNode               -                      -            SDN                -
Verizon      Cloud Storage       -                      -            -                  -












Besides the above basic infrastructure level services, there are third party solutions which are built on top of these and provide value added services. One such solution, worth mentioning here, is Zmanda Cloud Backup solution. It provides a GUI to configure and manage backups, and automate the process to a large extent.


Please let me know about other solutions in your comments and feedback, and I will keep adding to the list.