IAtomicLong, Hazelcast’s distributed implementation of java.util.concurrent.atomic.AtomicLong, offers most of AtomicLong’s operations, such as get, set, getAndSet, compareAndSet and incrementAndGet. Since IAtomicLong is a distributed implementation, these operations involve remote calls, so their performance differs from that of AtomicLong.
You can send functions to an IAtomicLong. The reason for using a function instead of a simple line of code like atomicLong.set(atomicLong.get() + 2); is that the read and the write are two separate operations, so the combination is not atomic. Since IAtomicLong is a distributed implementation, each of those operations can be a remote call, and another member may change the value in between, which leads to race conditions. By using functions, the data is not pulled into the code, but the code is sent to the data. This also makes it more scalable.
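The idea of applying a function at the data can be sketched on a single JVM with java.util.concurrent.atomic.AtomicLong, which IAtomicLong mirrors. This is only a local analogue, not Hazelcast code: updateAndGet stands in for sending a function to the member that owns the value, and the class and method names are made up for illustration. In Hazelcast 3.x, IAtomicLong offers methods such as alter and apply that take a serializable function object for this purpose.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AtomicFunctionSketch {

    /** Runs `threads` threads that each add 2 to a shared counter `perThread` times. */
    static long addTwoConcurrently(int threads, int perThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    // Racy alternative: counter.set(counter.get() + 2);
                    // Another thread could update the value between get() and set().
                    // Here the function is applied atomically to the current value.
                    counter.updateAndGet(v -> v + 2);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // 4 threads * 10_000 updates * 2 = 80000, with no lost updates
        System.out.println(addTwoConcurrently(4, 10_000));
    }
}
```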
IAtomicReference, Hazelcast’s distributed implementation of java.util.concurrent.atomic.AtomicReference, offers compare-and-set and get-and-set operations on object references that are guaranteed atomic across application instances in a cluster.
ICountDownLatch, Hazelcast’s distributed implementation of java.util.concurrent.CountDownLatch, is a synchronization aid that allows one or more threads (in one or more application instances) to wait until a set of operations being performed in other threads across the cluster completes.
ICountDownLatch is initialized with a given count. The countDown() method is a non-blocking operation that decrements the count. When the count reaches zero, all threads blocking on the await() method are allowed to proceed.
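The count, countDown() and await() interplay described above can be sketched with the JDK class that ICountDownLatch is modeled on. This is a single-JVM illustration with hypothetical class and method names; the distributed version works across members and its exact method signatures may differ.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class LatchSketch {

    /** Starts `workers` threads, waits for all of them, and returns how many finished. */
    static int runAndAwait(int workers) {
        CountDownLatch latch = new CountDownLatch(workers); // initialized with a count
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                finished.incrementAndGet(); // the operation being waited on
                latch.countDown();          // non-blocking decrement of the count
            }).start();
        }
        try {
            latch.await(); // blocks until the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndAwait(5)); // all 5 workers have finished by now
    }
}
```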
IdGenerator is a distributed id generator that facilitates creating ids that are unique across application instances in a cluster.
Hazelcast List is similar to Hazelcast Set, but it also allows duplicate elements and preserves the order of elements. Hazelcast List is a non-partitioned data structure where values and each backup are represented by their own single partition, so it cannot be scaled beyond the capacity of a single machine. For iteration, all items are copied to the local member and the iteration occurs locally.
ILock is the distributed implementation of java.util.concurrent.locks.Lock. If you lock using an ILock, the critical section that it guards is guaranteed to be executed by only one thread in the entire cluster. Even though locks are great for synchronization, they can lead to problems if not used properly. Also note that Hazelcast Lock does not support fairness.
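One common way to avoid the problems mentioned above is to release the lock in a finally block, so that a failure inside the critical section cannot leave the lock held. The sketch below uses the JDK's ReentrantLock through the java.util.concurrent.locks.Lock interface that ILock implements; it is a local illustration with made-up names, not Hazelcast code.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class LockSketch {
    private final Lock lock = new ReentrantLock();
    private long total; // only touched while holding the lock

    void add(long amount) {
        lock.lock();
        try {
            total += amount; // critical section
        } finally {
            lock.unlock(); // always release, even if the critical section throws
        }
    }

    long total() {
        lock.lock();
        try {
            return total;
        } finally {
            lock.unlock();
        }
    }

    /** Adds 1 from several threads and returns the final total. */
    static long addConcurrently(int threads, int perThread) {
        LockSketch sketch = new LockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    sketch.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return sketch.total();
    }
}
```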
Hazelcast Map (IMap) extends the interface java.util.concurrent.ConcurrentMap and hence java.util.Map. It is the distributed implementation of the Java map. You can perform operations like reading from and writing to a Hazelcast map with the well-known get and put methods. In addition, Search and Map/Reduce can be run on maps. Finally, maps may be integrated with a database using MapStore.
Hazelcast MultiMap is a specialized map where you can store multiple values under a single key. Just like any other distributed data structure implementation in Hazelcast, MultiMap is distributed and thread-safe.
Hazelcast distributed queue is an implementation of java.util.concurrent.BlockingQueue. Being distributed, it enables all cluster members to interact with it. Using Hazelcast distributed queue, you can add an item on one machine and remove it from another one.
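The producer and consumer roles can be sketched with a JDK BlockingQueue, the interface the distributed queue implements. In this local analogue the two "machines" are simply two threads, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class QueueSketch {

    /** One thread puts 0..items-1 on the queue; the caller takes them off in order. */
    static List<Integer> transfer(int items) {
        // Small capacity so that put() blocks when the consumer falls behind.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> received = new ArrayList<>();
        try {
            for (int i = 0; i < items; i++) {
                received.add(queue.take()); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```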
Unlike IMap, which is partitioned to balance data across the cluster, ReplicatedMap is fully replicated such that all members have the full map in memory. Its replication is weakly consistent (rather than eventually consistent) and is done on a best-effort basis.
ReplicatedMaps have faster read-write characteristics, since all data is present on the local member and writes happen locally and are replicated eventually. Replication messages are also batched to minimize network operations.
ReplicatedMaps are useful for immutable objects, catalog data, or idempotent calculable data.
Hazelcast Ringbuffer is a lock-free distributed data structure that stores its data in a ring-like structure. Think of it as a circular array with a given capacity. Each Ringbuffer has a tail, where the items are added, and a head, where the items are overwritten or expired. You can reach each element in a Ringbuffer using a sequence ID, which is mapped to the elements between the head and tail (inclusive) of the Ringbuffer. It supports single and batch operations and is very high-performance.
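The head, tail and sequence bookkeeping can be sketched with a plain circular array. The class below is a hypothetical, single-threaded illustration of the idea only: it is neither distributed nor lock-free, and its method names are chosen for readability rather than taken from the Hazelcast API.

```java
final class RingbufferSketch<T> {
    private final Object[] slots;
    private long tailSequence = -1; // sequence of the newest item; -1 while empty

    RingbufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new Object[capacity];
    }

    /** Appends at the tail, overwriting the oldest item when full; returns the item's sequence. */
    long add(T item) {
        tailSequence++;
        slots[(int) (tailSequence % slots.length)] = item;
        return tailSequence;
    }

    long tailSequence() {
        return tailSequence;
    }

    /** Sequence of the oldest item that has not been overwritten yet. */
    long headSequence() {
        return Math.max(0, tailSequence - slots.length + 1);
    }

    /** Reads the item with the given sequence, which must lie between head and tail (inclusive). */
    @SuppressWarnings("unchecked")
    T readOne(long sequence) {
        if (sequence < headSequence() || sequence > tailSequence) {
            throw new IllegalArgumentException("sequence " + sequence + " is not available");
        }
        return (T) slots[(int) (sequence % slots.length)];
    }
}
```

With a capacity of 3, adding five items leaves sequences 2 to 4 readable; sequences 0 and 1 have been overwritten.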
Hazelcast ISemaphore is the distributed implementation of java.util.concurrent.Semaphore. Semaphores offer permits to control the number of threads that perform a concurrent activity at the same time. To execute a concurrent activity, a thread acquires a permit or waits until a permit becomes available. When the execution is completed, the permit is released.
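A minimal sketch of the permit cycle, using the JDK Semaphore that ISemaphore is modeled on. It runs in one JVM, the names are made up for illustration, and it measures how many threads were inside the guarded section at once.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class SemaphoreSketch {

    /** Returns the largest number of threads observed inside the guarded section at once. */
    static int maxConcurrency(int permits, int threads) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                semaphore.acquireUninterruptibly(); // wait until a permit is available
                try {
                    int now = inside.incrementAndGet();
                    maxSeen.accumulateAndGet(now, Math::max);
                } finally {
                    inside.decrementAndGet();
                    semaphore.release(); // hand the permit back when done
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxSeen.get();
    }
}
```

The returned value never exceeds the number of permits, however many threads compete for them.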
A Set is a collection where every element occurs only once and where the order of the elements doesn’t matter. Hazelcast Set, com.hazelcast.core.ISet, is a distributed and concurrent implementation of java.util.Set.
In Hazelcast, the ISet (and the IList) is implemented as a collection within MultiMap, where the id of the set is the key in the MultiMap and the value is the collection.
Hazelcast provides a distribution mechanism for publishing messages that are delivered to multiple subscribers. This is also known as a publish/subscribe (pub/sub) messaging model. Publishing and subscribing operations are cluster-wide. When a member subscribes to a topic, it is actually registering for messages published by any member in the cluster, including new members that join after you add the listener.
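The publish/subscribe model itself can be sketched in a few lines of plain Java. This is an in-process toy with hypothetical names: there is no cluster, no network and no ordering or reliability guarantee, only the rule that every message goes to every listener registered at the time of publishing.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class TopicSketch<T> {
    // Copy-on-write list so listeners can be added while a publish is in progress.
    private final List<Consumer<T>> listeners = new CopyOnWriteArrayList<>();

    /** Registers a listener for all messages published from now on. */
    void subscribe(Consumer<T> listener) {
        listeners.add(listener);
    }

    /** Delivers the message to every currently registered listener. */
    void publish(T message) {
        for (Consumer<T> listener : listeners) {
            listener.accept(message);
        }
    }
}
```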
ReliableTopic is backed by a Ringbuffer with a backup to avoid message loss and to provide isolation between fast producers and slow consumers.
An entry processor enables fast in-memory operations on a map without having to worry about locks or concurrency issues. It can be applied to a single map entry or to all map entries. It supports choosing target entries using predicates. You do not need any explicit lock on entry: Hazelcast locks the entry, runs the EntryProcessor, and then unlocks the entry.
Hazelcast sends the entry processor to each cluster member and these members apply it to map entries. Therefore, if you add more members, your processing is completed faster.
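A rough single-JVM analogue of an entry processor is ConcurrentHashMap's per-key atomic update. The sketch below is not the Hazelcast API (there, an EntryProcessor is passed to IMap methods such as executeOnKey or executeOnEntries); the class and method names are invented, and a value predicate stands in for Hazelcast's entry predicates.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

final class EntryProcessorSketch {

    /** Applies `processor` atomically, entry by entry, to every entry whose value matches `filter`. */
    static <K, V> void processMatching(ConcurrentHashMap<K, V> map,
                                       Predicate<V> filter,
                                       UnaryOperator<V> processor) {
        for (K key : map.keySet()) {
            // computeIfPresent runs the function atomically for this key,
            // so callers need no explicit lock around read-modify-write.
            map.computeIfPresent(key, (k, v) -> filter.test(v) ? processor.apply(v) : v);
        }
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> stock = new ConcurrentHashMap<>();
        stock.put("apples", 1);
        stock.put("pears", 10);
        processMatching(stock, v -> v >= 10, v -> v + 1);
        System.out.println(stock.get("apples") + " " + stock.get("pears")); // 1 11
    }
}
```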
One of the coolest features of Java 1.5 is the Executor framework, which allows you to asynchronously execute your tasks (logical units of work), such as a database query, a complex calculation, or image rendering.
The default implementation of this framework (ThreadPoolExecutor) is designed to run within a single JVM. In distributed systems, this implementation is not desired since you may want a task submitted in one JVM and processed in another one. Hazelcast offers IExecutorService for you to use in distributed environments: it implements java.util.concurrent.ExecutorService to serve the applications requiring computational and data processing power.
With IExecutorService, you can execute tasks asynchronously and perform other useful tasks. If your task execution takes longer than expected, you can cancel the task execution. In the Java Executor framework, tasks are implemented as java.util.concurrent.Callable and java.util.Runnable. If you need to return a value and submit to Executor, use Callable. Otherwise, use Runnable (if you do not need to return a value). Tasks should be Serializable since they will be distributed.
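The Callable pattern can be sketched with a local thread pool. Since IExecutorService implements java.util.concurrent.ExecutorService, the submit-and-get flow should look much the same in a cluster, but this example runs in one JVM only, and SumTask is a hypothetical task written for illustration.

```java
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ExecutorSketch {

    /** A task that returns a value, so it is a Callable; Serializable so it could be shipped to another member. */
    static final class SumTask implements Callable<Long>, Serializable {
        private static final long serialVersionUID = 1L;
        private final int upTo;

        SumTask(int upTo) {
            this.upTo = upTo;
        }

        @Override
        public Long call() {
            long sum = 0;
            for (int i = 1; i <= upTo; i++) {
                sum += i;
            }
            return sum;
        }
    }

    /** Submits the task asynchronously and waits for its result. */
    static long submitAndGet(int upTo) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            Future<Long> future = executor.submit(new SumTask(upTo));
            // A task that runs too long could be abandoned with future.cancel(true).
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```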
In the case of special/custom needs, Hazelcast’s SPI (Service Provider Interface) module allows users to develop their own distributed data structures and services.
The SPI makes it possible to write first-class distributed services and data structures yourself. With the SPI, you can write your own data structures if you are unhappy with the ones provided by Hazelcast. You could also write more complex services, such as an Actor library.
Based on the Hazelcast MapReduce framework, Aggregators are ready-to-use data aggregations. These are typical operations like summing up values, finding minimum or maximum values, calculating averages, and other operations that you would expect in the relational database world.
Aggregation operations are implemented on top of the MapReduce framework and all operations can be achieved using pure MapReduce calls. However, using the Aggregation feature is more convenient for a big set of standard operations.
Listener with Predicate enables you to listen to the modifications performed on specific map entries. It is an entry listener that is registered using a predicate. This makes it possible to listen to the changes made to specific map entries.
You have likely heard about MapReduce ever since Google released its research white paper on this concept. With Hadoop as the most common and well known implementation, MapReduce gained a broad audience and made it into all kinds of business applications dominated by data warehouses.
MapReduce is a software framework for processing large amounts of data in a distributed way. Therefore, the processing is normally spread over several machines. The basic idea behind MapReduce is to map your source data into a collection of key-value pairs and then, in a second step, to reduce those pairs, grouped by key, towards the final result. The main steps behind MapReduce are to read the source data, map the data to one or multiple key-value pairs, and then reduce all pairs with the same key.
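The three steps can be illustrated with the classic word count, written here in plain Java on one machine. It does not use the Hazelcast MapReduce API; it only makes the map, group-by-key and reduce phases visible, and all names are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class WordCountSketch {

    static Map<String, Integer> wordCount(List<String> lines) {
        // Map step: turn every line of source data into (word, 1) pairs.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    pairs.add(new SimpleEntry<>(word, 1));
                }
            }
        }

        // Grouping step: collect all values that share the same key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }

        // Reduce step: fold each key's values into a single result.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            int sum = 0;
            for (int value : group.getValue()) {
                sum += value;
            }
            result.put(group.getKey(), sum);
        }
        return result;
    }
}
```

In a distributed setting the map step runs where the data lives and only the grouped pairs travel over the network; here everything happens in one method for clarity.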
Hazelcast partitions your data and spreads it across a cluster of servers. You can iterate over the map entries and look for certain entries (specified by predicates) you are interested in. However, this is not very efficient because you will have to bring the entire entry set to the caller and iterate locally. Instead, Hazelcast allows you to run distributed queries on your distributed map.
If you add new members to the cluster, the partition count for each member is reduced and hence the time spent by each member on iterating its entries is reduced. Therefore, the Hazelcast querying approach is highly scalable. Another reason it is highly scalable is the pool of partition threads that evaluates the entries concurrently in each member. The network traffic is also reduced since only filtered data is sent to the requester.
Hazelcast provides a distributed second-level cache for your Hibernate entities, collections and queries. This cache is associated with the Session Factory object. It is not restricted to a single session, but is shared across sessions, so data is available to the entire application, not just the current user. This can greatly improve application performance, as commonly used data can be held in memory in the application tier. As the name implies, Hibernate will go to the first-level cache first, and if the entity is not there, it will go to the second level.
Web and application servers can easily be made to scale out to handle huge loads by adding devices such as a load balancer. This has a second effect of providing redundancy. However, for applications that use web sessions, this introduces a new problem. If a server goes down and the load balancer moves the user to a new server, the session is lost. The solution to this problem is to provide web session clustering. Open source application servers typically provide a way to plug in a web session clustering solution, but do not provide the clustering mechanism. Commercial application servers may provide a clustering mechanism, but it is typically neither robust nor performant.
Hazelcast provides this web session clustering. User sessions are maintained in the Hazelcast cluster, using multiple copies for redundancy. Hazelcast provides 3 solutions:
This plugin integrates the Hazelcast data distribution framework into your Grails application. You can reach distributed data structures (map, queue, list, topic) by injecting hazelService. You can also cache your domain classes in the Hazelcast distributed cache.
You may replace Ehcache with Hazelcast as secondary hibernate cache implementation.
Hazelcast JCA resource adapter is a system-level software driver used by a Java application to connect to a Hazelcast cluster.
JCache is the standardized Java caching layer API. The JCache caching API is specified by the Java Community Process (JCP) as Java Specification Request (JSR) 107.
Starting with release 3.3.1, Hazelcast offers a specification compliant JCache implementation. It is not just a simple wrapper around the existing APIs; it implements a caching structure from ground up to optimize the behavior to the needs of JCache. The Hazelcast JCache implementation is 100% TCK (Technology Compatibility Kit) compliant and therefore passes all specification requirements. It has asynchronous versions of almost all operations to give the user extra power.
Hazelcast supports the Apache jclouds API, allowing applications to be deployed in multiple different cloud infrastructure ecosystems in an infrastructure-agnostic way.
Hazelcast AWS cloud module helps Hazelcast cluster members discover each other and form the cluster on AWS. It also supports tagging, IAM Role, and connecting clusters from clients outside the cloud.
Azure DiscoveryStrategy provides all Hazelcast instances in a cluster by returning VMs within your Azure resource group that are tagged with a specified value.
The Hazelcast Discovery is an extension SPI to attach external cloud discovery mechanisms. Discovery finds other Hazelcast instances based on filters and provides their corresponding IP addresses.
The SPI ships with support for Apache jclouds and Google’s Kubernetes as reference implementations.
Docker containers wrap up Hazelcast in a complete filesystem that contains everything it needs to run – code, runtime, system tools, system libraries – guaranteeing that it will always run the same, regardless of the environment it is running in.
You can deploy your Hazelcast projects using the Docker containers. Hazelcast has the following images on Docker:
Kubernetes is an open source orchestration system for Docker containers. It handles scheduling onto nodes in a compute cluster and actively manages workloads to ensure that their state matches the user’s declared intentions.
The Hazelcast Zookeeper Discovery plugin provides a service-based discovery strategy for Hazelcast 3.6.1+ Discovery SPI enabled applications. It uses Apache Curator to communicate with your Zookeeper server.
The on-heap store refers to objects that will be present in the Java heap (and also subject to GC). Java heap is the space that Java can reserve and use in memory for dynamic memory allocation. All runtime objects created by a Java application are stored in heap. By default, the heap size is 128 MB, but this limit is reached easily for business applications. Once the heap is full, new objects cannot be created and the Java application shows errors.
Hazelcast High-Density Memory Store, the successor to Hazelcast Elastic Memory, is Hazelcast’s new enterprise-grade backend storage solution. This solution is used with the Hazelcast JCache implementation. By default, Hazelcast offers a production-ready storage backend with low garbage collection (GC) pressure. With this default backend, serialized keys and values are still stored in standard Java map-like data structures on the heap. The data structures are stored in serialized form for the highest data compaction, and are still subject to Java Garbage Collection.
In Hazelcast Enterprise, the High-Density Memory Store is built around a pluggable memory manager which enables multiple memory stores. These memory stores are all accessible using a common access layer that scales up to Terabytes of main memory on a single JVM. At the same time, by further minimizing the GC pressure, High-Density Memory Store enables predictable application scaling and boosts performance and latency while minimizing pauses for Java Garbage Collection.
There are cases where you need to synchronize multiple clusters to the same state. Synchronization of clusters, also known as WAN (Wide Area Network) Replication, is mainly used for replicating the states of different clusters over WAN environments like the Internet.
Hazelcast members expose various management beans which include statistics about distributed data structures and the states of Hazelcast node internals. The metrics are local to the nodes, i.e. they do not reflect cluster wide values. The JMX API allows you to access these metrics.
You can gather various statistics from your distributed data structures via Statistics API. Since the data structures are distributed in the cluster, the Statistics API provides statistics for the local portion (1/Number of Nodes) of data on each node. You can gather the following statistics:
Clustered JMX via Management Center allows you to monitor clustered statistics of distributed objects from a JMX interface. You can use jconsole or any other JMX client to monitor your Hazelcast Cluster. Use the Clustered JMX interface to integrate Hazelcast Management Center with New Relic and AppDynamics.
For Hazelcast Enterprise, the Clustered REST API is exposed from Management Center to allow you to monitor clustered statistics of distributed objects. To enable Clustered REST on your Management Center, you need only pass a system property at startup.
A Memcache client written in any language can talk directly to a Hazelcast cluster. No additional configuration is required. (Hazelcast Memcache Client only supports ASCII protocol. Binary Protocol is not supported.)
Hazelcast’s new client-server protocol now supports versioning and easy client implementation. This provides enterprises deployment and upgrade flexibility by allowing clients to be upgraded independently of servers. Caching services may be deployed and upgraded enterprise-wide, without forcing clients across business units to upgrade in lock step.
The accompanying protocol documentation and client implementation guide also allow clients to be easily implemented on any platform. The implementation guide ships with a Python reference implementation.
The Clustered REST API is exposed from Management Center to allow you to monitor clustered statistics of distributed objects.
Hazelcast allows you to intercept socket connections before a node joins the cluster or a client connects to a node. This provides the ability to add custom hooks to the join and connection procedures (such as identity checking using Kerberos).
Hazelcast allows you to encrypt the entire socket level communication among all Hazelcast members. Encryption is based on Java Cryptography Architecture. In symmetric encryption, each node uses the same key, so the key is shared.
The authentication mechanism for Hazelcast Client security works the same as cluster member authentication. To implement client authentication, configure a Credential and one or more LoginModules. The client side does not have and does not need a factory object to create Credentials objects like ICredentialsFactory. Credentials must be created at the client side and sent to the connected node during the connection process.
Hazelcast client authorization is configured by a client permission policy. Hazelcast has a default permission policy implementation that uses permission configurations defined in the Hazelcast security configuration. Default policy permission checks are done against instance types (map, queue, etc.), instance names (map, queue, name, etc.), instance actions (put, read, remove, add, etc.), client endpoint addresses, and the client principal defined by the Credentials object. Instance and principal names and endpoint addresses can be defined as wildcards (*).
Hazelcast has an extensible, JAAS based security feature you can use to authenticate both cluster members and clients, and to perform access control checks on client operations. Access control can be done according to endpoint principal and/or endpoint address.
Hazelcast allows you to intercept every remote operation executed by the client. This lets you add very flexible custom security logic.
You can use the native .NET client to connect to Hazelcast nodes. All you need to do is add HazelcastClient3x.dll to your .NET project references. The API is very similar to that of the Java native client.
You can use the Native C++ Client to connect to Hazelcast nodes and perform almost all operations that a node can perform. Clients differ from nodes in that clients do not hold data. The C++ Client is by default a smart client, i.e. it knows where the data is and asks the correct node directly. The features of C++ Clients are:
Native Clients (Java, C#, C++) enable you to perform almost all Hazelcast operations without being a member of the cluster. A native client either connects to one of the cluster members and delegates all cluster-wide operations to it (dummy client), or connects to all of them and delegates operations smartly (smart client). When the cluster member that the client relies on dies, the client transparently switches to another live member.
The Java client is the most full featured client. It is offered both with Hazelcast and Hazelcast Enterprise. The main idea behind the Java client is to provide the same Hazelcast functionality by proxying each operation through a Hazelcast node. It can access and change distributed data, and it can listen to distributed events of an already established Hazelcast cluster from another Java application.
Near Cache allows a subset of data to be cached locally in memory on the Java client.
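The read path of a near cache can be sketched as a local map placed in front of a remote lookup. This is a simplified, hypothetical illustration: the remote call is just a function passed to the constructor, and features a real near cache needs, such as size limits, eviction and invalidation driven by the cluster, are left out apart from a manual invalidate.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class NearCacheSketch<K, V> {
    private final Map<K, V> local = new ConcurrentHashMap<>();
    private final Function<K, V> remoteGet; // stand-in for a call to the cluster
    private final AtomicInteger remoteCalls = new AtomicInteger();

    NearCacheSketch(Function<K, V> remoteGet) {
        this.remoteGet = remoteGet;
    }

    /** Serves repeated reads from local memory; only a miss goes to the remote side. */
    V get(K key) {
        return local.computeIfAbsent(key, k -> {
            remoteCalls.incrementAndGet();
            return remoteGet.apply(k);
        });
    }

    /** Drops the local copy, for example after the remote value has changed. */
    void invalidate(K key) {
        local.remove(key);
    }

    int remoteCalls() {
        return remoteCalls.get();
    }
}
```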
As an alternative to the existing serialization methods, Hazelcast offers a language/platform independent Portable serialization that has the following advantages:
Portable serialization is totally language independent and is used as the binary protocol between Hazelcast server and clients.
You need to serialize the Java objects that you put into Hazelcast because Hazelcast is a distributed system. The data and its replicas are stored in different partitions on multiple nodes. The data you need may not be present on the local machine, and in that case, Hazelcast retrieves that data from another machine. This requires serialization.
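What turning an object into bytes and back looks like can be shown with the JDK's built-in java.io serialization. This is only an illustration of the round trip with made-up helper names; Hazelcast uses its own binary representation and supports several serialization mechanisms, which this sketch does not reproduce.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializationSketch {

    /** Converts an object into the bytes that could be sent to another machine. */
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray(); // the stream has been flushed and closed by now
    }

    /** Rebuilds an object from its bytes, as the receiving machine would. */
    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```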
Hazelcast serializes all your objects into an instance of com.hazelcast.nio.serialization.Data. Data is the binary representation of an object. Serialization is used when:
The Python client is the reference implementation of the new Hazelcast Client Binary Protocol. Hazelcast’s robust In-Memory Data Grid is now available to Python applications.
You can use the Hazelcast Node.js client to connect to Hazelcast nodes. You can install the client via Node Package Manager (npm).
The Scala API for Hazelcast is a “soft” API, i.e. it expands the Java API rather than replacing it. The Scala API also adds built-in distributed aggregations and IMap join capability.
Hazelcast Apache Spark Connector allows Hazelcast Maps and Caches to be used as shared RDD caches by Spark using the Spark RDD API. Both Java and Scala Spark APIs are supported.
Hazelcast Mesos Integration module gives you the ability to deploy Hazelcast on a Mesos cluster. Since it depends on the Hazelcast Zookeeper module for discovery, the version of Hazelcast deployed on the Mesos cluster should not be lower than 3.6.