OpenDistro Multi-node ElasticSearch Cluster

Jul 1, 2020

If you’re like me, you have arrived here after googling configuration examples of a self-hosted ElasticSearch (ES) cluster. Here you go: https://github.com/kave/opendistro-elasticsearch

Still reading? Good, let’s answer a few questions:

Why OpenDistro?

ElasticSearch started as a fully open source project, but over time (and as it grew in popularity) it started putting features behind a proprietary paywall. Hence OpenDistro. You can read more about this here:

Since OpenDistro is backed by AWS and is used in their ElasticSearch service, it gave me confidence that if moving to the cloud becomes an option later, the migration should be simpler.

Multi-Node Cluster Configuration Walkthrough

We’re going with a multi-node setup because Elasticsearch distributes the documents in an index across multiple shards, and distributes those shards across multiple nodes. That gives you redundancy, which protects against hardware failures, and query capacity that increases as nodes are added to the cluster. As the cluster grows (or shrinks), Elasticsearch automatically migrates shards to rebalance it.

The repo’s docker-compose configuration creates one master node and two data nodes, all with persisted data, and the containers communicate with each other over TLS.

Steps:

1. Clone repo: https://github.com/kave/opendistro-elasticsearch

2. Create a local certificate authority, plus a unique private key and certificate for each node. I’ve simplified this by providing a script that creates local self-signed certificates.

cd certs
./ssl_creation.sh
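If you’re curious what a script like this typically does, here is a minimal bash sketch using openssl. It is not a copy of the repo’s ssl_creation.sh: the function names, file names, certificate subjects and 730-day validity are all assumptions picked for illustration.

```shell
# Sketch only (bash + openssl). Names, subjects and validity are assumptions,
# not necessarily what the repo's ssl_creation.sh uses.

# Create a self-signed root CA key and certificate in directory $1.
make_root_ca() {
  local dir="$1"
  openssl genrsa -out "${dir}/root-ca-key.pem" 2048 &&
  openssl req -new -x509 -sha256 -days 730 \
    -key "${dir}/root-ca-key.pem" \
    -subj "/O=Example/CN=example-root-ca" \
    -out "${dir}/root-ca.pem"
}

# Create a private key and a CA-signed certificate for node $2 in directory $1.
make_node_cert() {
  local dir="$1" node="$2"
  openssl genrsa -out "${dir}/${node}-key-temp.pem" 2048 &&
  # Convert the key to unencrypted PKCS#8, a format Java-based servers read easily.
  openssl pkcs8 -topk8 -nocrypt -inform PEM -outform PEM \
    -in "${dir}/${node}-key-temp.pem" -out "${dir}/${node}-key.pem" &&
  # Certificate signing request for the node.
  openssl req -new -key "${dir}/${node}-key.pem" \
    -subj "/O=Example/CN=${node}" -out "${dir}/${node}.csr" &&
  # Sign the request with the root CA.
  openssl x509 -req -sha256 -days 730 -in "${dir}/${node}.csr" \
    -CA "${dir}/root-ca.pem" -CAkey "${dir}/root-ca-key.pem" \
    -CAcreateserial -out "${dir}/${node}.pem" &&
  rm -f "${dir}/${node}-key-temp.pem" "${dir}/${node}.csr"
}
```

For example, `make_root_ca /tmp/certs` followed by `make_node_cert /tmp/certs node1` should leave root-ca.pem, node1-key.pem and node1.pem in that (existing) directory.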

3. Add the root authority certificate (root.pem) to your keychain. I’ll provide macOS Keychain instructions below:

Open the macOS Keychain app
Go to File > Import Items…
Select your root certificate file (i.e. root.pem)

Double click on your root certificate in the list
Expand the Trust section
Change the When using this certificate: select box to Always Trust
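If you’d rather stay in the terminal, the same trust setting can probably be applied with macOS’s security tool. This is a sketch, not something the repo ships: trust_root_ca is a hypothetical helper name, and it adds the certificate to the System keychain (which needs sudo), so the result may not be identical to importing it through the app.

```shell
# Sketch (bash, macOS only): trust a root CA from the command line.
# trust_root_ca is a hypothetical helper; $1 is the path to your root certificate.
trust_root_ca() {
  local cert="$1"
  if [ ! -f "$cert" ]; then
    echo "trust_root_ca: certificate not found: $cert" >&2
    return 1
  fi
  if [ "$(uname -s)" != "Darwin" ]; then
    echo "trust_root_ca: this helper only works on macOS" >&2
    return 1
  fi
  # -d: add to the admin trust settings; -r trustRoot: trust as a root CA;
  # -k: keychain to add the certificate to.
  sudo security add-trusted-cert -d -r trustRoot \
    -k /Library/Keychains/System.keychain "$cert"
}
```

Call it with the path to the root certificate generated in step 2.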

4. Increase Docker memory to 6GB. I’ve currently set each node to have 2GB, but feel free to edit this for your use case.

5. Launch containers. This will bring up the 3 containers:

docker-compose --compatibility up

Done 🙌🏿


6. Confirm cluster health via the health check endpoint.

From the repo root directory:

curl 'https://admin:admin@local.es:9200/_cluster/health?pretty' --cacert $PWD/certs/root-ca.pem
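The nodes can take a little while to find each other, so the first call may not report green straight away. Here is a small bash sketch that polls the same endpoint until the status turns green. The helper names (health_status, wait_for_green), the 5 second pause and the default of 30 attempts are my own assumptions, and the status parsing is a simple sed match rather than a real JSON parser.

```shell
# Sketch (bash): poll the cluster health endpoint until it reports green.

# Read a _cluster/health JSON response on stdin and print its "status" value.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# $1 = health URL, $2 = path to the root CA certificate, $3 = max attempts (default 30)
wait_for_green() {
  local url="$1" cacert="$2" attempts="${3:-30}" status i
  for ((i = 1; i <= attempts; i++)); do
    status="$(curl -s --cacert "$cacert" "$url" | health_status)"
    echo "attempt $i: status=${status:-unreachable}"
    [ "$status" = "green" ] && return 0
    sleep 5
  done
  return 1
}
```

From the repo root this would be called as `wait_for_green 'https://admin:admin@local.es:9200/_cluster/health' "$PWD/certs/root-ca.pem"`.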

7. Success! 🚀

Tips for Production

Set Xmx and Xms of your Java heap space to no more than 50% of your physical RAM.

config/cluster.conf:

ES_JAVA_OPTS=-Xms1g -Xmx1g

docker-compose.yml:

deploy:
  resources:
    limits:
      memory: 2G
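The two settings above follow the 50% rule: a 2G container limit paired with a 1g heap. If you change the container limit, a tiny bash helper like the one below can work out the matching heap line. The function name is hypothetical, and it assumes you pass the limit in MiB.

```shell
# Sketch (bash): derive an ES_JAVA_OPTS line that gives the heap half of a
# container memory limit. $1 = memory limit in MiB.
heap_opts_for_limit_mb() {
  local limit_mb="$1"
  local heap_mb=$(( limit_mb / 2 ))
  echo "ES_JAVA_OPTS=-Xms${heap_mb}m -Xmx${heap_mb}m"
}
```

For the 2G limit used here, `heap_opts_for_limit_mb 2048` yields a 1024m heap, which is the same as the 1g set in config/cluster.conf.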

Ensure ES can create the necessary number of threads. This can be done by setting ulimit -u 4096 as root before starting Elasticsearch, or by setting nproc to 4096 in /etc/security/limits.conf.

<app_user>         soft       nproc          4096
<app_user>         hard       nproc          4096

Increase the file descriptor limit. Set ulimit -n 65535 as root before starting Elasticsearch, or set nofile to 65535 in /etc/security/limits.conf.

<app_user>         soft       nofile          1024
<app_user>         hard       nofile          65535
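To confirm the limits actually apply to the user that starts Elasticsearch, you could run a quick preflight check from that user’s shell. This is a bash sketch with hypothetical helper names. It reads the current shell’s limits, `ulimit -u` for max user processes (threads) and `ulimit -n` for open files, and compares them against the 4096 and 65535 targets.

```shell
# Sketch (bash): check the current shell's limits against the values above.

# Return 0 if a ulimit value ($1) is "unlimited" or at least the required minimum ($2).
limit_ok() {
  local current="$1" required="$2"
  [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]
}

# Report any limit that is too low; returns non-zero if at least one check fails.
check_es_limits() {
  local rc=0
  limit_ok "$(ulimit -u)" 4096 ||
    { echo "max user processes (ulimit -u) is below 4096" >&2; rc=1; }
  limit_ok "$(ulimit -n)" 65535 ||
    { echo "open files (ulimit -n) is below 65535" >&2; rc=1; }
  return "$rc"
}
```

Run `check_es_limits` as the Elasticsearch user; no output and a zero exit code means both limits are high enough.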

If you are bind-mounting a local directory or file, it must be readable by the elasticsearch user. Also, this user must have write access to the data and log dirs. A good strategy is to grant group access to gid 0 for the local directory. For example, to prepare a local directory for storing data through a bind-mount:

mkdir esdatadir
chmod g+rwx esdatadir
chgrp 0 esdatadir

If possible, run your nodes on different machines within the same network. Spreading them out improves redundancy and reduces the blast radius of a disaster.

For more tech musings, feel free to follow me here.

Written by Emmanuel Apau
