Tech Talk Series

April 01, 2020

Tech conferences and meetups have been canceled or postponed across the world. To make the situation a little more bearable for everyone who misses them, Hazelcast has started a series of virtual tech meetups.

Please join us on Thursdays, starting April 2nd. Sessions always start at 3:30pm CET / 7:30am PDT / 10:30am EDT / 2:30pm GMT.

The list of topics:

Streaming in the world of legacy applications (Vladimir Schreiner)

Date: Thursday, April 2, 2020

Recording: https://youtu.be/LzuRPXUrQZA

A practical introduction to CDC (Change Data Capture). Architecture, trade-offs, tooling, and demos.

Common themes emerge when people describe their reasons for rearchitecting legacy business applications. At a technical level, the themes are speed and scalability. At a business level, it is the need to gain new real-time insights. These legacy applications commonly center around a central datastore, such as a relational database. Moving away from this architecture requires a massive migration effort. The costs and risks associated with such an effort can be prohibitive for business owners: you can’t just rip out your relational database.

A lower-risk, gradual transition to a target architecture often wins the day. Streaming, caching, and CDC technologies are vital tools for this journey. CDC (Change Data Capture) can turn your legacy data stores into streaming sources. Modern caching technologies can host data in a way that provides speed and scalability. Finally, streaming acts as the glue that can drive new use cases as well as bridge the old.
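To make the CDC idea concrete, here is a minimal sketch in plain Java of how a stream of change events can keep a cache in sync with a legacy table. It is an illustration only, not the tooling shown in the talk: the `ChangeEvent` class, the `Op` enum, and the key format are hypothetical, and a `ConcurrentHashMap` stands in for a distributed cache.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CdcSketch {

    /** Hypothetical change types, mirroring typical row-level database operations. */
    enum Op { INSERT, UPDATE, DELETE }

    /** Hypothetical change event: one row-level change captured from the legacy store. */
    static final class ChangeEvent {
        final Op op;
        final String key;
        final String value; // null for DELETE events

        ChangeEvent(Op op, String key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    /** Applies a single change event so that the cache mirrors the source table. */
    static void apply(Map<String, String> cache, ChangeEvent event) {
        switch (event.op) {
            case INSERT:
            case UPDATE:
                cache.put(event.key, event.value);
                break;
            case DELETE:
                cache.remove(event.key);
                break;
        }
    }

    /** Replays an ordered list of change events into a fresh cache. */
    static Map<String, String> replay(List<ChangeEvent> events) {
        // Assumption: a local map stands in for a distributed cache.
        Map<String, String> cache = new ConcurrentHashMap<>();
        for (ChangeEvent event : events) {
            apply(cache, event);
        }
        return cache;
    }

    public static void main(String[] args) {
        List<ChangeEvent> events = Arrays.asList(
                new ChangeEvent(Op.INSERT, "customer:1", "Alice"),
                new ChangeEvent(Op.INSERT, "customer:2", "Bob"),
                new ChangeEvent(Op.UPDATE, "customer:1", "Alice Smith"),
                new ChangeEvent(Op.DELETE, "customer:2", null));
        System.out.println(replay(events));
    }
}
```

The events must be applied in the order the database committed them. Replaying them out of order could leave the cache with a stale value or a row that was already deleted.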

Machine Learning at Scale using distributed stream processing (Marko Topolnik)

Date: Thursday, April 9, 2020

Recording: https://youtu.be/acDl6_c44ro

The capabilities of machine learning are now pretty well understood, and there are great tools to do data science and construct models that answer nontrivial questions about your data. These tools are mostly used in Python.

The key new challenge is making the trained prediction model usable in real time, while the user is interacting with your software. Getting answers from an ML model (this is called inference) takes a lot of CPU and must be done at serious scale. The ML tools are optimized mainly for batch-processing a lot of data at once, and often the implementations aren’t parallelized.

In this talk, I will show an approach that allows you to write a low-latency, auto-parallelized, and distributed stream processing pipeline in Java that seamlessly integrates with a data scientist’s work taken in almost unchanged form from their Python development environment.

The talk includes a live demo using the command line and going through some Python and Java code snippets.

3 Easy Improvements in Your Microservices Architecture (Nicolas Frankel)

Date: Thursday, April 16, 2020

Recording: https://youtu.be/snR2JpTTX4I

While a microservices architecture is more scalable than a monolith, it takes a direct hit on performance.

To cope with that, one performance improvement is to set up a cache. It can be configured for database access, for REST calls or just to store session state across a cluster of server nodes. In this demo-based talk, I’ll show how Hazelcast In-Memory Data Grid can help you in each one of those areas and how to configure it. Hint: it’s much easier than one would expect.
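As a rough illustration of the caching idea, the sketch below wraps a slow lookup (a database query or REST call, for example) so that the backend is hit only on a cache miss. It uses only the Java standard library. The `CachingLookup` class is hypothetical and a local `ConcurrentHashMap` stands in for a distributed map, so this is not the Hazelcast configuration shown in the talk.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class CacheAsideSketch {

    /** Hypothetical wrapper: runs the slow loader only when the key is not cached yet. */
    static final class CachingLookup<K, V> {
        // Assumption: a local map stands in for a cache shared across service instances.
        private final Map<K, V> cache = new ConcurrentHashMap<>();
        private final Function<K, V> loader;

        CachingLookup(Function<K, V> loader) {
            this.loader = loader;
        }

        /** Returns the cached value, loading and storing it on the first request. */
        V get(K key) {
            return cache.computeIfAbsent(key, loader);
        }

        /** Drops a cached entry, for example after the underlying record changes. */
        void invalidate(K key) {
            cache.remove(key);
        }
    }

    public static void main(String[] args) {
        AtomicInteger backendCalls = new AtomicInteger();
        CachingLookup<Integer, String> users = new CachingLookup<>(id -> {
            backendCalls.incrementAndGet(); // simulates an expensive remote call
            return "user-" + id;
        });

        users.get(42);
        users.get(42); // served from the cache, no second backend call
        System.out.println("backend calls: " + backendCalls.get());
    }
}
```

The trade-off is staleness: a cached entry stays until it is invalidated, so the caller must decide when a change in the backing store should evict it.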

Distributed Snapshots (Viliam Ďurina)

Date: Thursday, April 23, 2020

Recording: https://youtu.be/z5XspIKOI4I

Fault tolerance can be a reason to choose a distributed system even if a single machine can handle the expected load: a distributed system can tolerate failures of its parts, while a system running on a single machine cannot. How can a stream-processing engine guarantee exactly-once semantics?

I’ll describe the Chandy-Lamport algorithm that can be used to snapshot the global state of a distributed system consistently. I’ll also describe its particular simplified case that’s used in Jet.
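One common simplification of marker-based snapshotting is barrier alignment: a processor that receives a snapshot barrier on one input stops reading that input until the barrier has arrived on all inputs, then saves its state. The single-threaded sketch below illustrates that idea under stated assumptions. It is not Jet’s implementation, and the `SummingProcessor` class and the `BARRIER` marker value are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class BarrierSnapshotSketch {

    /** Hypothetical marker value that stands for a snapshot barrier in a channel. */
    static final int BARRIER = Integer.MIN_VALUE;

    /** A processor that sums the items arriving on several input channels. */
    static final class SummingProcessor {
        private final List<Queue<Integer>> inputs = new ArrayList<>();
        private final boolean[] barrierSeen;
        private long sum;                                // the processor's running state
        final List<Long> snapshots = new ArrayList<>();  // states saved at each snapshot

        SummingProcessor(int channelCount) {
            barrierSeen = new boolean[channelCount];
            for (int i = 0; i < channelCount; i++) {
                inputs.add(new ArrayDeque<>());
            }
        }

        void send(int channel, int item) {
            inputs.get(channel).add(item);
        }

        long sum() {
            return sum;
        }

        /** Drains the inputs; a channel is not read after its barrier until all barriers arrive. */
        void run() {
            boolean progress = true;
            while (progress) {
                progress = false;
                for (int ch = 0; ch < inputs.size(); ch++) {
                    Queue<Integer> queue = inputs.get(ch);
                    if (barrierSeen[ch] || queue.isEmpty()) {
                        continue; // channel is blocked by its barrier, or has nothing to read
                    }
                    int item = queue.poll();
                    progress = true;
                    if (item == BARRIER) {
                        barrierSeen[ch] = true;
                        if (allBarriersSeen()) {
                            snapshots.add(sum);              // state covers only pre-barrier items
                            Arrays.fill(barrierSeen, false); // unblock all channels
                        }
                    } else {
                        sum += item;
                    }
                }
            }
        }

        private boolean allBarriersSeen() {
            for (boolean seen : barrierSeen) {
                if (!seen) {
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        SummingProcessor processor = new SummingProcessor(2);
        // Channel 0 delivers its barrier early; the item behind it must wait.
        processor.send(0, BARRIER);
        processor.send(0, 10);
        for (int item : new int[] {1, 2, 3, BARRIER}) {
            processor.send(1, item);
        }
        processor.run();
        System.out.println("snapshot: " + processor.snapshots + ", final sum: " + processor.sum());
    }
}
```

Because the item behind the early barrier is held back, the saved state reflects exactly the items sent before the barrier on every channel. That is what lets a restart from the snapshot replay later items without counting any of them twice.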

Advanced Kubernetes: Lesson Learned From Building a Managed Service (Hüseyin BABAL)

Date: Thursday, May 7, 2020

Recording: https://youtu.be/qPPe7O5KvI8

In this session, I will explain how to create a multi-tenant environment on Kubernetes to build a managed service.
I will share golden rules for building a managed service on top of Kubernetes, with real-life examples from my experience developing Hazelcast Cloud:

  • Environment isolation
  • Microservice Architecture
  • Monitoring
  • Logging
  • Tracing

Embedded Time Series Storage: A Cookbook (Andrey Pechkurov)

Date: Thursday, May 21, 2020

Recently, the Hazelcast Management Center team had to build an embedded Java time series storage on top of existing well-known components. In this (very) practical talk, we are going to discuss the technical challenges and design decisions made during the process. The talk should be helpful for those who want to learn more about time series storage and databases.

About the Author

Vladimir Schreiner

Vladimir is a product manager with an engineering background and deep expertise in stream processing and real-time data pipelines. Ten years of building internal software platforms and development infrastructure have made him passionate about new technologies and finding ways to simplify data processing. Vladimir co-authored two white papers on the topic: Understanding Stream Processing: Fast Processing of Infinite and Big Data, and A Reference Guide to Stream Processing. His tutorial video on stream processing and real-time data pipelines discusses the building blocks of a stream processing pipeline and demonstrates how developers can write a full-blown streaming pipeline in less than a hundred lines of Java code for a variety of applications. Vladimir is also a lecturer with the Czechitas Foundation, whose mission is to inspire women and girls to explore the world of information technology. Czechitas Foundation teaches coding in various programming languages, software testing, and data analysis.
