Data import options with Elasticsearch

Posted by Hariharan Vadivelu
Architecturally, there are two approaches to loading data into Elasticsearch (ES). At the outset you have to decide between a "push" and a "pull" model, based on your requirements and performance goals. In this article we will explore the ES data load options in both categories.

I sourced much of this information from the ES mailing list. In fact, this is a compilation of everything I found there while researching the topic, because I could not find a tutorial or article that covers it comprehensively.

Before we dive in, there are a few basics to remember about indexing data in ES. Load performance is generally best with more shards, and query performance is generally best with more replicas, so you need to find the sweet spot for your setup. All indexing goes through the primary shards. Take an iterative approach to your indexing needs: don't start with tuning; instead, let tuning decisions follow from what you learn about your setup. Also remember that it takes significantly more time to index into an existing index than into an empty one.

Pull Model

River plugin 

River plugins are custom plugin code that is deployed within ES and runs inside the ES node. They are a good fit when you expect a constant flow of data to be indexed and you don't want to write a separate external application to push data into ES. A very good use case is indexing analytics and server logs, or data coming out of a NoSQL store such as Cassandra or MongoDB.

River plugins also support import using the Bulk API. This is useful when the river plugin can accumulate data up to a certain threshold before indexing it. Since the client runs within the ES node, it is cluster aware.
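
To make the "accumulate up to a threshold" idea concrete, here is a minimal sketch in plain Java. It is not river plugin code: `BatchBuffer` and its `flusher` callback are hypothetical names I made up for illustration, and the callback stands in for whatever actually sends the batch to the Bulk API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers JSON documents and hands them off in batches. */
class BatchBuffer {
    private final int threshold;
    private final Consumer<List<String>> flusher;
    private final List<String> pending = new ArrayList<>();

    BatchBuffer(int threshold, Consumer<List<String>> flusher) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
        this.flusher = flusher;
    }

    /** Adds one document and flushes as soon as the threshold is reached. */
    synchronized void add(String jsonDoc) {
        pending.add(jsonDoc);
        if (pending.size() >= threshold) {
            flush();
        }
    }

    /** Hands off whatever is buffered; call once more when the source is drained. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flusher.accept(new ArrayList<>(pending)); // pass a copy so the buffer can be reused
        pending.clear();
    }
}
```

With a threshold of 3 and seven incoming documents, the callback would see batches of 3, 3 and (after the final `flush()`) 1. In practice you would probably also flush on a timer so that a slow trickle of data does not sit in the buffer indefinitely.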

Push Model

curl -XPUT
This is perhaps the simplest way to index a document: you just perform a PUT on a REST endpoint. It works best during the development phase, when you want to index a few documents from the command line for quick validation.
curl -XPOST '' -d '{"partnumber":"HLG028_281201","name":"Modern Houseware Hanging Lamp","shortdescription":"A red hanging lamp in a triangular shape.","longdescription":"A hanging lamp with red ambient shades to add a romantic mood to your room. Perfect for your bedroom or your children'\''s room. Easy set up so you do not have to pay electricians to set it up."}'
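
If you would rather do the same thing from Java, the request can be sketched with the JDK's own HTTP client (Java 11+). This only builds the request; the host `http://localhost:9200` and the index, type and id used in the usage note are placeholders I picked for illustration, not values from my setup.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class IndexRequestSketch {
    /** Builds (but does not send) a PUT that indexes one document under an explicit id. */
    static HttpRequest buildIndexRequest(String baseUrl, String index, String type,
                                         String id, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/" + type + "/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

A call such as `buildIndexRequest("http://localhost:9200", "catalog", "product", "HLG028_281201", json)` gives you a request you can pass to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the response body tells you whether the document was indexed.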

UDP bulk import

UDP is a connectionless datagram protocol. This option is faster but less reliable, because you get no acknowledgement of success or failure.
E.g. cat bulk.txt | nc -w 0 -u localhost 9700

Bulk API over HTTP

This is a good fit if you have an external application that consolidates the data in a timely manner and then formats it as JSON to be indexed. It is much more reliable than UDP bulk import, because you get an acknowledgement of each index operation and can take corrective steps based on the response.
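
Both bulk options expect the same newline-delimited payload: one action line followed by one source line per document. Here is a rough sketch of how an external application might format it. The class and method names are mine, and the sketch assumes each document is already single-line JSON and that the index and type names need no escaping.

```java
import java.util.List;

class BulkBodySketch {
    /** Builds a bulk body: an "index" action line plus a source line per document. */
    static String buildBulkBody(String index, String type, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (String doc : jsonDocs) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_type\":\"").append(type).append("\"}}\n");
            // Every line, including the very last one, must end with a newline.
            body.append(doc).append('\n');
        }
        return body.toString();
    }
}
```

The resulting string is what you would write to a file like bulk.txt for the UDP example, or send as the body of a POST to the `_bulk` endpoint when you want the per-item acknowledgement.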

Java TransportClient bulk indexing 

This can be used within a custom ETL load that runs outside the ES nodes: you connect to an ES node from a remote host and can index with multiple threads. It saves a bit of HTTP overhead by using the native ES protocol. Bulk is always best, as it tries to group the requests per shard and minimize network round trips. The TransportClient is thread safe and is built to be reused by several threads. When coding a bulk load, make sure you do not create a TransportClient in a loop; instead, send all requests through one TransportClient instance per JVM, perhaps by creating it as a singleton.

Internally, the TransportClient sends each request asynchronously and is thread safe.
Another nice thing about the TransportClient is that it automatically round-robins requests across ES nodes, and the node that receives a bulk request then spreads it to the respective "shard bulks".
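
The "one instance per JVM" advice can be sketched as a small lazy holder. This is only an illustration: `SharedClient` is a hypothetical name, and it is generic so that the sketch compiles without the ES jars. In a real loader the type parameter would be `TransportClient` and the factory would build it from your settings.

```java
import java.util.function.Supplier;

/** Hypothetical holder that creates one shared client on first use and then reuses it. */
final class SharedClient<T> {
    private final Supplier<T> factory;
    private volatile T instance;

    SharedClient(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the single shared instance, creating it at most once (double-checked locking). */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Keep one `SharedClient` in a static field and have every worker thread call `get()`, so the expensive client is never created inside the indexing loop.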

Here is a sample snippet that can be used for connecting to the ES cluster.

   ImmutableSettings.Builder clientSettings = ImmutableSettings.settingsBuilder()
              .put("http.enabled", "false")
              .put("discovery.zen.minimum_master_nodes", 1)
              .put("", 4)
              .put("discovery.zen.ping_timeout", 100)
              .put("discovery.zen.fd.ping_timeout", 300)
              .put("discovery.zen.fd.ping_interval", 5)
              .put("discovery.zen.fd.ping_retries", 5)
              .put("client.transport.ping_timeout", "10s")
              .put("multicast.enabled", false)
              .put("", esHosts)
              .put("", esClusterName)
              .put("index.refresh_interval", "10") //change refresh interval to a higher value
              .put("index.merge.async", true); //change index merge to async

Here is sample code for creating an ES client and using the bulk API for indexing.

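 import java.util.LinkedList;
 import java.util.List;
 import org.elasticsearch.action.bulk.BulkRequestBuilder;
 import org.elasticsearch.action.bulk.BulkResponse;
 import org.elasticsearch.action.index.IndexRequestBuilder;
 import org.elasticsearch.client.transport.TransportClient;
 import org.elasticsearch.common.transport.InetSocketTransportAddress;
 import org.elasticsearch.common.transport.TransportAddress;
 import org.elasticsearch.common.xcontent.XContentFactory;

 // Assumes the clientSettings built above and an SLF4J-style logger field.
 TransportClient client = new TransportClient(clientSettings);
 List<TransportAddress> addresses = new LinkedList<TransportAddress>();
 // Add one or more ES addresses and ports
 addresses.add(new InetSocketTransportAddress("<ES_IP>", Integer.parseInt("<ES_PORT>")));
 client.addTransportAddresses(addresses.toArray(new TransportAddress[addresses.size()]));

 // Create initial bulk request builder
 BulkRequestBuilder bulkRequest = client.prepareBulk();
 IndexRequestBuilder indexRequestBuilder = client.prepareIndex("<ES_INDEX_NAME>", "regular");
 // Build the JSON content using XContentBuilder (one sample field shown here)
 indexRequestBuilder.setSource(XContentFactory.jsonBuilder()
         .startObject().field("partnumber", "HLG028_281201").endObject());
 bulkRequest.add(indexRequestBuilder);
 BulkResponse bulkResponse = bulkRequest.execute().actionGet();

 if (bulkResponse.hasFailures()) {
     logger.error("Failed to send all requests in bulk " + bulkResponse.buildFailureMessage());
     return true;
 } else {
     logger.info("Elasticsearch Index updated in {} ms.", bulkResponse.getTookInMillis());
 }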

Performance Tuning

1. Start by tuning the index refresh rate for the duration of bulk indexing. While importing a large amount of data, it is recommended to disable refresh by setting the refresh interval to -1; you can then refresh the index programmatically towards the end of the load.

You can define the index refresh rate globally in config/elasticsearch.yml or at index level. A value of -1 disables refresh, or you can set any positive value that suits your index refresh requirements.
curl -XPUT localhost:9200/test/_settings -d '{
    "index" : {
        "refresh_interval" : "-1"
    } }'
2. You can change the bulk thread pool size, but tune it carefully. Under most circumstances the defaults are good enough, but you can adjust them to your application's requirements; for instance, if you expect data to flow into the index all the time, consider giving the bulk pool more threads.
Always remember this rule of thumb: every thread eats up system resources, so try to match the pool size to the number of cores.

# Search pool
threadpool.search.type: fixed
threadpool.search.size: 3
threadpool.search.queue_size: 100

# Bulk pool
threadpool.bulk.type: fixed
threadpool.bulk.size: 2
threadpool.bulk.queue_size: 300

# Index pool
threadpool.index.type: fixed
threadpool.index.size: 2
threadpool.index.queue_size: 100
3. If you want both maximum load performance and maximum search performance, use two indexes, one for the old generation and one for the new generation, and connect them with an index alias. Distribute the indexes over the nodes so that they form two separate groups on different machines (for example, by moving shards or using shard allocation). Set the replica level to 0 (no replicas) for the new-generation index, and forward searches only to the nodes holding the old generation. After the bulk load is complete, raise the replica level on the new generation and switch from old to new with the index alias (or just drop the old generation). You may see a performance hit while the replicas build up, but it is small compared to the cost of the bulk load.

4. One of the simplest and most effective strategies is to start with an index that has no replicas and, once indexing is done, increase the number of replicas to the number you want. This reduces the load during indexing.
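
The settings changes in the tuning steps above boil down to a handful of small JSON request bodies. Here is a sketch of how a loader might build them. The class and method names are mine, and the sketch assumes index and alias names that need no JSON escaping.

```java
class TuningRequestsSketch {
    /** Body for PUT /<index>/_settings: "-1" disables refresh, a value like "1s" re-enables it. */
    static String refreshIntervalBody(String interval) {
        return "{\"index\":{\"refresh_interval\":\"" + interval + "\"}}";
    }

    /** Body for PUT /<index>/_settings: 0 while loading, the target replica count afterwards. */
    static String replicaCountBody(int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replicas must be >= 0");
        }
        return "{\"index\":{\"number_of_replicas\":" + replicas + "}}";
    }

    /** Body for POST /_aliases: moves the alias from the old index to the new one in one call. */
    static String aliasSwitchBody(String alias, String oldIndex, String newIndex) {
        return "{\"actions\":["
                + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
                + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}"
                + "]}";
    }
}
```

A typical sequence would be: send `refreshIntervalBody("-1")` and `replicaCountBody(0)` before the load, run the bulk import, then restore the refresh interval, raise the replica count, and finally send `aliasSwitchBody(...)` so searches move to the new generation.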

