Evan Tahler


@ 06 Jan 2014

architecture elasticsearch ops node


At TaskRabbit, we use Elasticsearch for a number of things (including search, of course). In development, we follow the usual pattern of having a few distinct environments in which we build and test our code. The ‘acceptance’ environment is supposed to mirror production, including a copy of its data. However, we could not find a good tool to help us copy our Elasticsearch indices… so we made one!


elasticdump works by sending an input to an output. Each of them can be either an Elasticsearch URL or a file.

Elasticsearch: - format: {protocol}://{host}:{port}/{index} - example:

File: - format: {FilePath} - example: /Users/evantahler/Desktop/dump.json
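As a rough illustration of the two formats above, here is a minimal sketch of how an input or output value could be classified. The `parseEndpoint` helper and the shape of the object it returns are assumptions made for this example; they are not taken from elasticdump's source.

```javascript
// Hypothetical helper (not part of elasticdump): classify an --input/--output value.
function parseEndpoint(value) {
  // Assumption: anything starting with http:// or https:// is an Elasticsearch URL.
  if (/^https?:\/\//i.test(value)) {
    const url = new URL(value);
    // Assumption: the first path segment is the index name.
    const index = url.pathname.split("/").filter(Boolean)[0] || null;
    return {
      type: "elasticsearch",
      protocol: url.protocol.replace(":", ""),
      host: url.hostname,
      port: url.port,
      index: index,
    };
  }
  // Everything else is treated as a file path.
  return { type: "file", path: value };
}

console.log(parseEndpoint("http://production.es.com:9200/my_index"));
console.log(parseEndpoint("/Users/evantahler/Desktop/dump.json"));
```

The check on the protocol prefix is the simplest rule that separates the two formats described here; a real tool may well use a different one.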

You can then do things like:

  • Copy an index from production to staging:
    • elasticdump --input=http://production.es.com:9200/my_index --output=http://staging.es.com:9200/my_index
  • Back up an index to a file:
    • elasticdump --input=http://production.es.com:9200/my_index --output=/var/dat/es.json


  • --input (required) (see above)
  • --output (required) (see above)
  • --limit: how many objects to move in bulk per operation (default: 100)
  • --debug: display the Elasticsearch commands being used (default: false)
  • --delete: delete documents one-by-one from the input as they are moved (default: false)
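To make the effect of --limit concrete, here is a small sketch of moving objects in batches. It uses plain in-memory arrays as stand-ins for the input and output, and the `moveInBatches` function is a hypothetical name for this example; it only illustrates the batching idea and is not how elasticdump itself is implemented.

```javascript
// Hypothetical sketch: move documents from `source` to `sink` in batches.
// Both are plain arrays standing in for an input and an output.
function moveInBatches(source, sink, limit = 100) {
  let operations = 0;
  for (let offset = 0; offset < source.length; offset += limit) {
    // Take at most `limit` documents per operation.
    const batch = source.slice(offset, offset + limit);
    sink.push(...batch);
    operations += 1;
  }
  return operations;
}

// 250 documents with the default limit of 100 are moved in batches of 100, 100 and 50.
const docs = Array.from({ length: 250 }, (_, i) => ({ _id: String(i) }));
const copied = [];
console.log(moveInBatches(docs, copied), copied.length);
```

A larger limit means fewer round trips but bigger requests, so the right value likely depends on document size.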


  • elasticdump (and Elasticsearch in general) will create indices on import if they don’t already exist
  • We use the PUT method to write objects. This means new objects will be created, and existing objects with the same ID will be updated
  • The file transport will overwrite any existing files
  • If you need basic HTTP auth, you can use it like this: --input=http://name:password@production.es.com:9200/my_index
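For readers curious about what the credentials in such a URL amount to, the sketch below shows how they map onto a standard HTTP Basic `Authorization` header. The `basicAuthHeader` helper is hypothetical and written for this example; elasticdump may handle the credentials differently (for instance, by handing the URL straight to its HTTP client).

```javascript
// Hypothetical helper (not taken from elasticdump): derive a Basic auth header
// from the credentials embedded in a URL.
function basicAuthHeader(endpoint) {
  const url = new URL(endpoint);
  if (!url.username) return null; // no credentials in the URL
  // URL components are percent-encoded, so decode them first.
  const user = decodeURIComponent(url.username);
  const pass = decodeURIComponent(url.password);
  // Basic auth is "user:password" encoded as base64.
  const token = Buffer.from(user + ":" + pass).toString("base64");
  return "Basic " + token;
}

console.log(basicAuthHeader("http://name:password@production.es.com:9200/my_index"));
```

Note that Basic auth only encodes the credentials and does not encrypt them, so an https URL is the safer choice when the cluster supports it.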

Inspired by https://github.com/crate/elasticsearch-inout-plugin and https://github.com/jprante/elasticsearch-knapsack

You can download elasticdump from NPM or GitHub.


