Get started with Elasticsearch

Dec 1, 2020 10:11 · 735 words · 4 minute read elasticsearch lucene search nosql json

Elasticsearch is a search engine based on the open source library Lucene. It really shines at full-text search but also supports other types of search: geo search, metrics aggregations, etc.

Elasticsearch vs Elastic vs ELK

Before getting into the details of Elasticsearch and setting up a single node cluster for testing, let’s clarify a few acronyms and names.

Elasticsearch, the search engine, is a product that gave its name to the company Elasticsearch. In 2015, the company renamed itself from Elasticsearch to Elastic to reflect the broader range of products it had started to offer. Indeed, the company started exploring problems beyond search early on. The core products of the Elastic company form the ELK stack.

  • E for Elasticsearch, the search engine
  • L for Logstash, used for cleansing and enriching data (typically logs)
  • K for Kibana, the main visualisation and dashboarding tool for Elasticsearch indexed data

Easy, right? … But Elastic now talks about the Elastic Stack, as it has added more products to its core stack:

  • X-Pack, which includes many features: security, monitoring, alerting, etc.
  • Beats, lightweight data shippers for gathering and shipping data (e.g. Heartbeat for uptime monitoring)

And we can expect more products and services to be added in the future.

You can use Elasticsearch without using any other product or service in the Elastic stack but they work very well together.

A common use case for the ELK stack is the indexing of logs data. The stack is usually implemented as follows:

  1. Logstash: cleanse the logs
  2. Elasticsearch: index the logs for search
  3. Kibana: visualisations and dashboarding
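
To make the first step more concrete, a minimal Logstash pipeline for this kind of log-indexing setup might look like the sketch below. Note this is an illustrative assumption, not a config from the original post: the Beats input port, the grok pattern, and the index name are all placeholders you would adapt to your own logs.

```
# logstash.conf -- minimal illustrative sketch
input {
  beats {
    port => 5044                 # receive events from a Beats shipper
  }
}
filter {
  grok {
    # parse Apache-style access logs into structured fields
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logs-%{+YYYY.MM.dd}"   # one index per day
  }
}
```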

Elasticsearch installation

OK, now that we have clarified a few things about the Elastic stack, let’s install a single-node Elasticsearch cluster with Kibana.

The easiest way to install Elasticsearch and play with it is to use the Docker image available on Docker Hub, and that’s what we are going to demonstrate today.

If you want to go a bit further, you can explore some of the key settings in the official Elasticsearch documentation.

Let’s use a docker-compose.yaml file (if you are not too familiar with docker-compose you can check our intro post on docker-compose with Django).

version: "3"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.3
    ports:
      - 9200:9200
      - 9300:9300
    volumes:
      - /usr/share/elasticsearch/data
    environment:
      - discovery.type=single-node
  kibana:
    image: docker.elastic.co/kibana/kibana:7.9.3
    depends_on:
      - elasticsearch
    ports:
      - 5601:5601

Once you have copied the above into a local docker-compose.yaml file, you can run

docker-compose up

It will take a bit of time to pull the Elasticsearch and Kibana images. Once that’s all done, you’ll have a single-node Elasticsearch cluster running in Docker.

Elasticsearch exposes a few endpoints to make your life easier. You can start by checking the details of your cluster at http://localhost:9200/.

If you want to explore further, a good starting point is the _cat endpoint available at http://localhost:9200/_cat. It lists the many endpoints available to explore your Elasticsearch cluster.

For instance, http://localhost:9200/_cat/health tells you a bit more about your cluster health. If you check it now, your cluster should be in a yellow state: with only one node, replica shards cannot be allocated. We’ll discuss the cluster states in a bit more detail later on.

Another useful endpoint is http://localhost:9200/_cat/indices which lists your Elasticsearch indices. There should be no index in the list just yet. Let’s create one to see how to do that.
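
The two _cat checks above can also be scripted. Here is a small Python sketch using only the standard library; it assumes the cluster from the docker-compose file is running on localhost:9200 and degrades gracefully if it is not:

```python
import urllib.request
import urllib.error

# Base URL of the local single-node cluster started with docker-compose.
BASE = "http://localhost:9200"

def cat(endpoint):
    """Fetch a _cat endpoint; the ?v flag adds a header row to the output."""
    url = f"{BASE}/_cat/{endpoint}?v"
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError) as exc:
        return f"cluster not reachable at {url}: {exc}"

print(cat("health"))   # cluster name, status (green/yellow/red), node counts, ...
print(cat("indices"))  # one line per index; empty at this point
```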

Creating your first Elasticsearch index and your first document

You can run the following to create your first Elasticsearch index

curl -X PUT http://localhost:9200/my-index-0

You should get a JSON response similar to the following

{"acknowledged":true,"shards_acknowledged":true,"index":"my-index-0"}

And if you inspect the _cat/indices endpoint again, your new index should be listed. You can inspect the index documents at http://localhost:9200/my-index-0/_search. You’ll notice that hits["hits"] is an empty list because we haven’t indexed any document yet.
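
To make that response shape concrete, here is a Python sketch that parses a representative empty _search response. The JSON below is a trimmed sample of what a 7.x cluster typically returns, not captured output:

```python
import json

# Trimmed sample of an empty _search response from a 7.x cluster.
raw = '''
{
  "took": 1,
  "timed_out": false,
  "hits": {
    "total": {"value": 0, "relation": "eq"},
    "max_score": null,
    "hits": []
  }
}
'''

response = json.loads(raw)
hits = response["hits"]["hits"]            # the list of matching documents
total = response["hits"]["total"]["value"] # how many documents matched
print(total, hits)  # 0 []
```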

Let’s index a document now

curl -X POST http://localhost:9200/my-index-0/_doc/1 -d '{"user": "test"}' -H 'Content-Type: application/json'

The response should now look like the following

{"_index":"my-index-0","_type":"_doc","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":0,"_primary_term":1}

confirming that you’ve successfully indexed your first document in Elasticsearch. http://localhost:9200/my-index-0/_search now shows 1 result where the details are nested under the _source object.

The document can also be inspected in more detail at http://localhost:9200/my-index-0/_doc/1.
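
Here is a short Python sketch showing where the document body sits inside a search hit. The JSON below mirrors the document we just indexed but is a trimmed, representative sample rather than captured output:

```python
import json

# Trimmed sample search response containing the document indexed above.
raw = '''
{
  "hits": {
    "total": {"value": 1, "relation": "eq"},
    "hits": [
      {
        "_index": "my-index-0",
        "_type": "_doc",
        "_id": "1",
        "_score": 1.0,
        "_source": {"user": "test"}
      }
    ]
  }
}
'''

response = json.loads(raw)
first_hit = response["hits"]["hits"][0]
# The original document body is nested under the _source field of each hit.
print(first_hit["_id"], first_hit["_source"])  # 1 {'user': 'test'}
```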

Conclusion

Phew. We’ve now set up Elasticsearch in Docker, created a first index, and POSTed a first document. We’ll stop here for now and I’ll see you again in a new post about the Elasticsearch Python client.
