A complete guide to Integrating MongoDB with Elastic Search

After almost two weeks and several re-installs and fresh installs, I finally managed to integrate MongoDB and Elasticsearch. Here is a step-by-step procedure for integrating them.

If you follow this procedure carefully, you should avoid errors like these:

Exception: java.lang.NoSuchMethodError: com.mongodb.Mongo.fsyncAndLock()

{
  "error" : "IndexMissingException[[testmongo] missing]",
  "status" : 404
}

Follow these guides to install MongoDB and Elasticsearch.

After you have installed them, it's time to install the Elasticsearch river.

Download the snapshot from here. Extract it and copy its contents to elastic_search_root/plugins/plugins/mongodb_river

Note: The initial implementation tutorial given on the git page points to an older version of the snapshot, which doesn't work with the latest versions of Elasticsearch and MongoDB.

Installing the mongodb river

Run the following two commands to install the mongodb river. If you are on a slow connection, the first command can take more than 15 minutes.

ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.4.0 
ES_HOME/bin/plugin -install richardwilly98/elasticsearch-river-mongodb/1.4.0 

After you install both of them, restart elasticsearch.

ES_HOME/bin/service/elasticsearch restart

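Enable replica sets in MongoDB by following this tutorial. The river picks up changes by reading the replica set oplog (local.oplog.rs), so this step is required even on a single-node setup.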
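As a rough sketch of what that step amounts to on a single machine, the helpers below start mongod as a one-member replica set and initiate it. The set name rs0 and the 100 MB oplog size are assumptions borrowed from the first reader comment below, and the function names are hypothetical. It also assumes the default data directory already exists and that the legacy mongo shell is on your PATH.

```shell
# Assumed values (see the first reader comment): set name rs0, oplog size 100 MB.
REPLSET_NAME=${REPLSET_NAME:-rs0}
OPLOG_MB=${OPLOG_MB:-100}

# Hypothetical helper: prints the mongod command line for a single-node replica set.
mongod_replset_cmd() {
    echo "mongod --replSet $REPLSET_NAME --oplogSize $OPLOG_MB"
}

# Hypothetical helper: initiates the set from the legacy mongo shell.
# Run it once, after mongod has started.
init_replset() {
    mongo --eval 'rs.initiate()'
}
```

Start the server with the printed command, then call init_replset from a second terminal. Running rs.status() in the mongo shell afterwards should report the node as PRIMARY.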

Tell Elasticsearch to index the "person" collection in the testmongo database by issuing the following command in your terminal:

curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{ 
    "type": "mongodb", 
    "mongodb": { 
        "db": "testmongo", 
        "collection": "person"
    }, 
    "index": {
        "name": "mongoindex", 
        "type": "person" 
    }
}'
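If you need to register rivers for more than one collection, the request above can be wrapped in a small shell function. This is only a sketch: river_meta_json, register_river and the ES_URL variable are hypothetical names introduced here, and the index type is simply set to the collection name, as in the example above.

```shell
# Hypothetical helper: prints the river definition for one MongoDB collection.
# Arguments: database, collection, index name.
river_meta_json() {
    printf '{"type":"mongodb","mongodb":{"db":"%s","collection":"%s"},"index":{"name":"%s","type":"%s"}}\n' \
        "$1" "$2" "$3" "$2"
}

# Hypothetical helper: registers the river under the name given as first argument.
# ES_URL is an assumed variable; it defaults to the local node used in this guide.
register_river() {
    river_name=$1
    shift
    river_meta_json "$@" |
        curl -XPUT "${ES_URL:-http://localhost:9200}/_river/${river_name}/_meta" -d @-
}
```

With these defined, register_river mongodb testmongo person mongoindex sends the same request as the command above.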

Add some data to MongoDB through the mongo shell:

use testmongo
var p = {firstName: "John", lastName: "Doe"}
db.person.save(p)

Use this command to search the data

curl -XGET 'http://localhost:9200/mongoindex/_search?q=firstName:John'
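To try other fields or indexes without retyping the URL, the search can be parameterized. This is a minimal sketch: search_url, search_field and ES_URL are hypothetical names, and the only addition to the request is the pretty=true parameter, which asks Elasticsearch for indented JSON.

```shell
# Hypothetical helper: builds the URI-search URL for one field/value pair.
# Arguments: index name, field, value.
search_url() {
    echo "${ES_URL:-http://localhost:9200}/$1/_search?q=$2:$3"
}

# Hypothetical helper: runs the search and asks for pretty-printed JSON.
search_field() {
    curl -XGET "$(search_url "$@")&pretty=true"
}
```

For example, search_field mongoindex firstName John repeats the query above. If it returns no hits, check that the document was saved into the same database and collection that the river was told to index.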

12 Replies to “A complete guide to Integrating MongoDB with Elastic Search”

  1. Hi Satish,

    With your help I am able to get rid of 404 exception. However, all my queries return NO hits. Following are the steps I have followed in sequence. Could you please help?

    1. Installed MongoDB 2.4.3
    2. Enabled ReplicaSet = rs0
    3. Enabled oplogSize = 100
    4. Restarted MongoDB Server
    5. Configured the rsConf variable on MongoShell
    6. Initiated the ReplicaSet
    ***************************************************************************
    rs0:PRIMARY> rs.status()
    {
      "set" : "rs0",
      "date" : ISODate("2013-05-21T17:42:41Z"),
      "myState" : 1,
      "members" : [
        {
          "_id" : 0,
          "name" : "127.0.0.1:27017",
          "health" : 1,
          "state" : 1,
          "stateStr" : "PRIMARY",
          "uptime" : 865,
          "optime" : {
            "t" : 1369157994,
            "i" : 1
          },
          "optimeDate" : ISODate("2013-05-21T17:39:54Z"),
          "self" : true
        }
      ],
      "ok" : 1
    }
    *************************************************************************
    7. Install ES
    8. Install attachment plugin
    9. Install River plugin
    10. Restart ES
    11. Curl ES to index documents from MongoDB with following command
    12. Added new documents to MongoDB
    13. On MongoDB, db.oplog.rs.find() returns data as below
    **************************************************************************
    rs0:PRIMARY> db.oplog.rs.find()
    { "ts" : { "t" : 1369152540, "i" : 1 }, "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
    { "ts" : { "t" : 1369152706, "i" : 1 }, "h" : NumberLong("7734203254950592529"), "v" : 2, "op" : "i", "ns" : "players.scores", "o" : { "_id" : ObjectId("519b9cc2871b3116f0b8006e"), "name" : "rohit", "score" : 20 } }
    { "ts" : { "t" : 1369157720, "i" : 1 }, "h" : NumberLong("2546253279621321716"), "v" : 2, "op" : "i", "ns" : "test.scores", "o" : { "_id" : ObjectId("519bb058b6fd31855ec8b0af"), "name" : "karthik", "score" : 20 } }
    { "ts" : { "t" : 1369157994, "i" : 1 }, "h" : NumberLong("-3356531630527802451"), "v" : 2, "op" : "i", "ns" : "test.scores", "o" : { "_id" : ObjectId("519bb16ab6fd31855ec8b0b0"), "name" : "dinesh", "score" : 20 } }
    ***************************************************************************
    14. Searching on ES for newly added document to scores collection on MongoDB
    does not return any results.

    I have tried posting a document directly to ES and it does return results, but not for the documents in MongoDB. I appreciate your help in advance.

    Thank You

  2. Thanks for the great tutorial, i had to just download the latest versions of the mapper and river plugin and it all worked 🙂

  3. simple question.. probably a stupid question..
    The curl command to index it doesnt seem to have the hostname of the mongo replicaset master – Does this article assume that the mongo replicaset is local to the elasticsearch server?
    If so, how to indicate to the elasticsearch server the hostname of mongodb?

  4. Ok.. as a response to my dumb question,, I found this curl command from https://github.com/richardwilly98/elasticsearch-river-mongodb/wiki
    Thanks!!

    $ curl -XPUT "localhost:9200/_river/${es.river.name}/_meta" -d '
    {
      "type": "mongodb",
      "mongodb": {
        "servers":
        [
          { "host": ${mongo.instance1.host}, "port": ${mongo.instance1.port} },
          { "host": ${mongo.instance2.host}, "port": ${mongo.instance2.port} }
        ],
        "options": {
          "secondary_read_preference" : true,
          "drop_collection": ${mongo.drop.collection},
          "exclude_fields": ${mongo.exclude.fields},
          "include_fields": ${mongo.include.fields},
          "include_collection": ${mongo.include.collection},
          "import_all_collections": ${mongo.import.all.collections},
          "initial_timestamp": {
            "script_type": ${mongo.initial.timestamp.script.type},
            "script": ${mongo.initial.timestamp.script}
          },
          "skip_initial_import" : ${mongo.skip.initial.import},
          "store_statistics" : ${mongo.store.statistics}
        },
        "credentials":
        [
          { "db": "local", "user": ${mongo.local.user}, "password": ${mongo.local.password} },
          { "db": "admin", "user": ${mongo.db.user}, "password": ${mongo.db.password} }
        ],
        "db": ${mongo.db.name},
        "collection": ${mongo.collection.name},
        "gridfs": ${mongo.is.gridfs.collection},
        "filter": ${mongo.filter}
      },
      "index": {
        "name": ${es.index.name},
        "throttle_size": ${es.throttle.size},
        "bulk_size": ${es.bulk.size},
        "type": ${es.type.name},
        "bulk": {
          "actions": ${es.bulk.actions},
          "size": ${es.bulk.size},
          "concurrent_requests": ${es.bulk.concurrent.requests},
          "flush_interval": ${es.bulk.flush.interval}
        }
      }
    }'

  5. hello

    I was trying to integrate MongoDB with Elasticsearch. After installing both plugins I restarted ES and also MongoDB. Later, when I tried to execute the code below on the command line

    curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
      "type": "mongodb",
      "mongodb": {
        "db": "testmongo",
        "collection": "person"
      },
      "index": {
        "name": "mongoindex",
        "type": "person"
      }
    }'

    It gave me

    {"_index":"_river","_type":"mongodb","_id":"_meta","_version":4,"created":false}

    Please help

    thank you

  6. hi Jayesh..
    The river definition (the _meta document) has already been created, so if you PUT it again, the version number is incremented on every update of the same document, for optimistic concurrency control.

  7. Hi,

    I have configured as you defined, but ES only pulls the ID column, it does not pull other column. Also, the ID column appears in ES as index ID? Any idea?

    Senthamarai

  8. Hey if i dont want to use oplog what is the other way to solve it

    please update asap
    Thanks With Regards

  9. Hi All,
    Can you please tell me where you are running this curl command? In the Mongo shell or through the system terminal?
    Thanks a lot in advance.

  10. Hi All,
    Sorry to disturb you all with my stupid question.
    I came to know that we can use restClient, curl or Postman to execute these REST APIs.
