Get Started

These guides demonstrate how to get started quickly with Hazelcast IMDG and Hazelcast Jet.

Hazelcast IMDG

Store and retrieve data from a distributed key-value store using Hazelcast IMDG. In this guide you’ll learn how to:

  • Create a cluster of 3 members
  • Start Hazelcast Management Center
  • Add data to the cluster using a sample client in the language of your choice (see the Java sketch below)
  • Add and remove cluster members to demonstrate Hazelcast's automatic data rebalancing
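To illustrate the client step above, here is a minimal Java sketch, assuming a cluster member is already running on the local machine; the map name and entries are only examples.

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class GetStartedClient {
    public static void main(String[] args) {
        // Connect to the running cluster using the default client configuration
        // (localhost, default cluster settings).
        HazelcastInstance client = HazelcastClient.newHazelcastClient();

        // Write an entry to a distributed map and read it back.
        Map<String, String> cities = client.getMap("cities");
        cities.put("1", "London");
        System.out.println("Read back: " + cities.get("1"));

        client.shutdown();
    }
}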

Hazelcast Jet

Build a distributed data processing pipeline in Java using Hazelcast Jet. In this guide you’ll learn how to:

  • Install Hazelcast Jet and form a cluster on your computer
  • Build a simple pipeline that receives a stream of data, performs some calculations, and outputs the results
  • Submit the pipeline as a job to the cluster and observe the results (see the sketch below)
  • Scale the cluster up and down while the job is still running
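As a taste of those steps, here is a minimal Java sketch that builds a trivial pipeline and submits it as a job. It assumes Jet's bundled test sources (com.hazelcast.jet.pipeline.test) are available and uses illustrative values only.

import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.test.TestSources;

public class HelloJet {
    public static void main(String[] args) {
        // Start an embedded Jet member; further members started on the same
        // network discover it and join the cluster automatically.
        JetInstance jet = Jet.newJetInstance();

        // A trivial batch pipeline: take a few items, transform them, log the results.
        Pipeline p = Pipeline.create();
        p.readFrom(TestSources.items(1, 2, 3, 4))
         .map(i -> i * 10)
         .writeTo(Sinks.logger());

        // Submit the pipeline to the cluster as a job and wait for completion.
        jet.newJob(p).join();

        Jet.shutdownAll();
    }
}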

Open Source Storage and Computing at In-Memory Speeds

Use Hazelcast IMDG to store your data in RAM, spread and replicate it across a cluster of machines, and perform data-local computation on it. Replication gives you resilience to failures of cluster nodes.

Use Hazelcast Jet to build data pipelines that process streams of events, such as those from message queues and database changelogs. The processing state is replicated, allowing you to scale the computation up and down without any loss of data.

Hazelcast IMDG

An open-source distributed in-memory object store supporting a wide variety of data structures such as Map, Set, List, MultiMap, Ringbuffer, and HyperLogLog. Cloud and Kubernetes friendly.

Hazelcast Jet

An open-source distributed stream and batch processing engine with embedded in-memory storage and a variety of connectors such as Kafka, Amazon S3, Hadoop, JMS, and JDBC.

// Start an embedded Hazelcast cluster member.
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
// Get the distributed map from the cluster.
IMap<String, String> map = hz.getMap("my-distributed-map");
// Standard put and get.
map.put("key", "value");
map.get("key");
// ConcurrentMap methods for optimistic updates.
map.putIfAbsent("somekey", "somevalue");
map.replace("key", "value", "newvalue");
// Shut down the Hazelcast cluster member.
hz.shutdown();

Distributed Map

Distributed Map is the most widely used data structure in Hazelcast IMDG. You can store objects directly from your application and get them back using the key or via SQL-like queries. Everything is stored in memory, with replicas spread across the cluster, and adding cluster members expands the space available for data. The example shown here is in Java, but the API is similar across other languages. The Distributed Map also recognises JSON values and allows querying on them.
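As a rough sketch of the JSON support and SQL-like querying mentioned above (assuming IMDG 4.x package names; the map name and fields are only examples):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.HazelcastJsonValue;
import com.hazelcast.map.IMap;
import com.hazelcast.query.Predicates;

import java.util.Collection;

public class QuerySketch {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Store JSON documents; their attributes become queryable.
        IMap<Long, HazelcastJsonValue> customers = hz.getMap("customers");
        customers.put(1L, new HazelcastJsonValue("{\"name\": \"Alice\", \"age\": 42}"));
        customers.put(2L, new HazelcastJsonValue("{\"name\": \"Bob\", \"age\": 28}"));

        // SQL-like predicate query over the JSON attribute "age".
        Collection<HazelcastJsonValue> result =
                customers.values(Predicates.greaterThan("age", 30));
        result.forEach(System.out::println);

        hz.shutdown();
    }
}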

// 'jet' is a JetInstance, e.g. obtained via Jet.newJetInstance().
IMap<String, Double> averagePrices = jet.getMap("current-avg-trade-price");

Pipeline p = Pipeline.create();
// Stream (trade symbol, price) records from Kafka.
p.readFrom(KafkaSources.kafka(props, "trades"))
 .withTimestamps(Trade::getTime, 0L)
 .filter(trade -> STOCKS.contains(trade.getSymbol()))
 .groupingKey(Trade::getSymbol)
 // 10-second sliding window, updated every 100 ms.
 .window(WindowDefinition.sliding(10_000, 100))
 .aggregate(AggregateOperations.averagingLong(Trade::getPrice))
 // Write the results to a distributed map.
 .map(window -> Util.entry(window.getKey(), window.getValue()))
 .writeTo(Sinks.map(averagePrices));
// Submit the pipeline to the cluster as a job.
jet.newJob(p);

Stream Processing

Hazelcast Jet can apply continuous transforms to a stream of data, such as filtering, mapping, windowed aggregation, and joining multiple data sources. It can handle events that arrive out of order and can be used to detect patterns in an event stream. Jet supports many different data sources and sinks, such as Apache Kafka, message brokers, relational databases, Amazon S3, Hadoop, and its own built-in distributed map.

<map name="customers">
    <backup-count>1</backup-count>
    <eviction eviction-policy="NONE" 
              max-size-policy="PER_NODE" size="0"/>
    <map-store enabled="true" initial-mode="LAZY">
        <class-name>com.examples.DummyStore</class-name>
        <write-delay-seconds>60</write-delay-seconds>
        <write-batch-size>1000</write-batch-size>
        <write-coalescing>true</write-coalescing>
        <properties>
           <property name="jdbc_url">my.jdbc.com</property>
        </properties>
    </map-store>
</map>

Database Caching

Use Hazelcast IMDG to speed up applications that read from and write to disk-backed stores, such as relational databases and NoSQL stores. Hazelcast IMDG supports several cache patterns such as Read-Through, Write-Through, Write-Behind, and Cache-Aside. With the first three patterns, the application need not know anything about the backing store; it just works with data structure APIs such as Map. Write-Behind solves the problem of slow data stores where the application would otherwise wait for an acknowledgment. The example above shows the cluster configuration for a Write-Behind store.
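The com.examples.DummyStore class referenced in the configuration above would implement the MapStore interface. A minimal sketch, assuming IMDG 4.x package names and with the actual persistence calls left as stubs:

package com.examples;

import com.hazelcast.map.MapStore;

import java.util.Collection;
import java.util.Map;

public class DummyStore implements MapStore<String, String> {

    @Override
    public void store(String key, String value) {
        // Write-Behind: called asynchronously, up to write-delay-seconds after the map update.
        // A real implementation would INSERT/UPDATE the row for this key over JDBC.
    }

    @Override
    public void storeAll(Map<String, String> map) {
        // Called with batches of up to write-batch-size entries.
        map.forEach(this::store);
    }

    @Override
    public void delete(String key) {
        // A real implementation would DELETE the row for this key.
    }

    @Override
    public void deleteAll(Collection<String> keys) {
        keys.forEach(this::delete);
    }

    @Override
    public String load(String key) {
        // Read-Through: called on a cache miss.
        return null;
    }

    @Override
    public Map<String, String> loadAll(Collection<String> keys) {
        return null;
    }

    @Override
    public Iterable<String> loadAllKeys() {
        // Returning null disables eager pre-loading, matching initial-mode LAZY.
        return null;
    }
}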

// PRODUCTS is the name of a map already cached in the cluster.
IMap<Integer, Product> productMap = jet.getMap(PRODUCTS);
Pipeline p = Pipeline.create();
// Read trades from files, enrich each with its product name, and write the joined records to files.
p.readFrom(Sources.files("trades"))
 .mapUsingIMap(productMap,
       trade -> trade.productId(),
       (t, product) -> tuple2(t, product.name())
 ).writeTo(Sinks.files("joined"));

Distributed Compute

Use Hazelcast Jet to speed up your MapReduce, Spark, or custom Java data processing jobs. Load data sets into a cluster cache and run compute jobs on top of the cached data. You get significant performance gains by combining an in-memory approach with co-location of jobs and data and parallel execution.
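As an illustrative sketch of such a job (assuming a map named "products" has already been loaded into the cluster cache):

import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;

public class CachedDataJob {
    public static void main(String[] args) {
        JetInstance jet = Jet.newJetInstance();

        // Read the cached entries data-locally from the distributed map and count them.
        Pipeline p = Pipeline.create();
        p.readFrom(Sources.map("products"))
         .aggregate(AggregateOperations.counting())
         .writeTo(Sinks.logger());

        jet.newJob(p).join();
    }
}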

Why Hazelcast?

Build Distributed Applications

Hazelcast provides tools for building distributed applications. Use Hazelcast IMDG for distributed coordination and in-memory data storage and Hazelcast Jet for building streaming data pipelines. Using Hazelcast allows developers to focus on solving problems rather than data plumbing.

Create a Cluster within Seconds

It’s easy to get started with Hazelcast. The nodes automatically discover each other to form a cluster, both in a cloud environment and on your laptop. This is great for quick testing and simplifies deployment and maintenance. No additional dependencies.
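For example, starting two embedded members with the default configuration is enough to see a cluster form; a minimal sketch:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class ClusterSketch {
    public static void main(String[] args) {
        // With the default configuration the two members discover each other
        // automatically and form a two-member cluster.
        HazelcastInstance member1 = Hazelcast.newHazelcastInstance();
        HazelcastInstance member2 = Hazelcast.newHazelcastInstance();

        System.out.println("Cluster size: " + member1.getCluster().getMembers().size());

        Hazelcast.shutdownAll();
    }
}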

Store Data In-Memory Resiliently

Hazelcast automatically partitions and replicates data in the cluster and tolerates node failures. You can add new nodes to increase storage capacity immediately. You can use it as a cache or to store transactional state, and perform data-local computations or queries. Because all data is stored in memory, you can access it with sub-millisecond latency. Clients for Java, Python, .NET, C++ and Go are available.

Build Fault-Tolerant Data Pipelines

Use Hazelcast Jet to build massively parallel data pipelines. You can process data using a rich library of transforms such as windowing, joins and aggregations. Jet keeps processing data without loss even when a node fails, and as soon as you add another node, it starts sharing the computation load. First-class support for Apache Kafka, Hadoop and many other data sources and sinks.

Easy Distributed Coordination

Hazelcast includes a full implementation of the Raft consensus algorithm, exposing a simple API for building linearizable distributed systems. Use tools like FencedLock, Semaphore and AtomicReference to simplify coordination between distributed applications.
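For instance, a minimal sketch of the FencedLock API (for strong guarantees the CP Subsystem should run with at least three members; the lock name is only an example):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.cp.lock.FencedLock;

public class CoordinationSketch {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A linearizable, Raft-backed lock shared by all members and clients.
        FencedLock lock = hz.getCPSubsystem().getLock("order-processing");
        lock.lock();
        try {
            // Only one process in the cluster executes this section at a time.
        } finally {
            lock.unlock();
        }

        hz.shutdown();
    }
}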

Single Binary

Hazelcast Jet and Hazelcast IMDG each ship as a single Java archive (JAR) of less than 15 MB. They are lightweight enough to run on small devices; you can embed them into your application as just another dependency or deploy them as a standalone cluster. First-class support for Kubernetes is included.

Who is using Hazelcast?

Hazelcast is deployed in the most demanding environments and applications.

Guides

Compare Redis with Hazelcast

Redis and Hazelcast solve many similar use cases, most commonly caching. They are quite different in how they approach things such as cache patterns, clustering & querying.

Build Cloud-Native Microservices

Set up a Hazelcast cluster in Kubernetes, and make use of Hazelcast storage and messaging capabilities in your microservices architectures.

Process Events from Apache Kafka

Use Hazelcast Jet to build a data processing pipeline that will process events from Apache Kafka as they arrive.

Use Distributed Data Structures

Use Hazelcast IMDG for storing and retrieving data from distributed in-memory data structures. You can store your data from one machine and access it from another or perform queries on it.

Free Hazelcast Online Training

Whether you're interested in learning the basics of in-memory systems, or you're looking for advanced, real-world production examples and best practices, we've got you covered for FREE!
