Amazon Web Services hopes enterprises will rely on its latest service, Kinesis, to get large amounts of data ready for analysis.
Kinesis was first made available as a limited preview last month, but is now available as a public beta. It is a managed service designed to handle real-time streams of data. It can collect and process large amounts of data, from any number of sources, including server logs and social media feeds, according to Amazon.
The pitch is the same as for many of Amazon’s other hosted services. In this case, enterprises don’t have to worry about provisioning, deploying and maintaining hardware and software to capture and store real-time data. Kinesis also replicates information across three facilities in a region, to improve availability and data durability.
Amazon sees a number of use cases for Kinesis. The service can collect data generated by an application and make it available to help identify slow queries or track page views and resource utilization. Kinesis can also collect and analyze financial information in real time, or help game developers see how players are interacting with their game and with each other.
The basic concept of Kinesis is a stream of data that is fed into the service and then read back out of it. Each stream is made up of what Amazon calls shards. Each shard can capture up to 1MB of data per second and up to 1,000 transactions per second. Applications linked to the service on the other side can read data from each shard at up to 2MB per second.
Deciding on the number of shards needed is the first step in the configuration process, and the Kinesis console includes a wizard to help with that. If a stream doesn’t have the capacity needed to handle all the information sent to it, data is either delayed or discarded. IT staff can dynamically resize a stream by splitting or merging shards while the stream is in use. For a given stream, each change takes a few seconds, and only one change can occur at a time, according to an FAQ published by Amazon.
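The sizing decision can be roughed out from the per-shard limits quoted above. The following is a minimal sketch, not Amazon's wizard: the function name and the sample workload are hypothetical, and it assumes the number of shards is simply set by whichever of the three limits is reached first.

```python
import math

# Per-shard limits as quoted in this article (public beta figures).
WRITE_MB_PER_SEC = 1.0    # data captured per shard
WRITE_TX_PER_SEC = 1000   # transactions captured per shard
READ_MB_PER_SEC = 2.0     # data read from each shard by applications


def shards_needed(write_mb_per_sec, tx_per_sec, read_mb_per_sec):
    """Return the smallest shard count that stays within all three limits.

    Hypothetical helper for illustration; assumes load is spread evenly
    across shards.
    """
    return max(
        1,
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC),
        math.ceil(tx_per_sec / WRITE_TX_PER_SEC),
        math.ceil(read_mb_per_sec / READ_MB_PER_SEC),
    )


# Made-up workload: 3.5MB/s written, 2,500 transactions/s, 6MB/s read.
print(shards_needed(3.5, 2500, 6.0))  # 4
```

In this example the write volume is the binding limit: 3.5MB per second needs four shards, while the transaction rate and read rate would each need only three.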
Another important building block of the service is the Kinesis Client Library, which acts as an intermediary between the Kinesis service itself and the business application that contains the logic needed to process data. Each Kinesis application has a unique name and operates on one specific stream. The client library is currently available in Java, but Amazon is planning to add support for other languages.
How much Kinesis costs depends on the number of transactions fed into the service and the number of shards used. Each transaction can be no larger than 50KB. One million transactions cost $0.028 and each shard costs $0.015 per hour. For now, Kinesis is available in the U.S. East (Northern Virginia) region.
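To see how the two charges add up, here is a rough estimate in Python using only the prices quoted above. The function name, the 720-hour month and the sample workload are assumptions made for illustration, and the sketch ignores any charges the article does not mention.

```python
# Prices as quoted in this article (public beta, U.S. East).
PRICE_PER_MILLION_TX = 0.028  # USD per one million transactions
PRICE_PER_SHARD_HOUR = 0.015  # USD per shard per hour


def estimated_cost(shards, tx_per_sec, hours=720):
    """Estimate the cost in USD of running a stream for `hours` hours.

    Hypothetical helper: assumes a constant transaction rate and a
    fixed shard count for the whole period.
    """
    transactions = tx_per_sec * 3600 * hours
    tx_cost = transactions / 1_000_000 * PRICE_PER_MILLION_TX
    shard_cost = shards * hours * PRICE_PER_SHARD_HOUR
    return tx_cost + shard_cost


# Made-up workload: 4 shards receiving 2,500 transactions/s for 30 days.
print(f"${estimated_cost(4, 2500):.2f}")  # $224.64
```

For this workload the transaction charge ($181.44) outweighs the shard charge ($43.20), so the bill is driven mainly by how many transactions are sent rather than by how many shards are provisioned.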