A Complete Guide to Integrating MongoDB with Elasticsearch

After almost two weeks and several reinstalls, I finally got MongoDB and Elasticsearch integrated. Here is a step-by-step procedure for integrating them.

Following this procedure carefully should prevent errors like these:

Exception: java.lang.NoSuchMethodError: com.mongodb.Mongo.fsyncAndLock()

{
  "error" : "IndexMissingException[[testmongo] missing]",
  "status" : 404
}

Follow these guides to install MongoDB and Elasticsearch.

After you have installed them, it’s time to install the Elasticsearch river.

Download the snapshot from here. Extract it and copy its contents to elastic_search_root/plugins/plugins/mongodb_river

Note: The initial implementation tutorial given on the project’s GitHub page points to an older version of the snapshot, which doesn’t work with the latest versions of Elasticsearch and MongoDB.

Installing the MongoDB river

Run the following two commands to install the mapper-attachments plugin and the MongoDB river. If you are on a slow connection, the first command can take more than 15 minutes.

ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.4.0 
ES_HOME/bin/plugin -install richardwilly98/elasticsearch-river-mongodb/1.4.0 

After you have installed both of them, restart Elasticsearch.

ES_HOME/bin/service/elasticsearch restart
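
A restart can take a few seconds, so it helps to wait until the node answers again before moving on. The following is only a rough sketch in Python, assuming Elasticsearch listens on localhost:9200 (as in the curl commands in this post) and that its root endpoint returns JSON containing version.number. The function names, retry count and delay are my own illustrative choices, not part of Elasticsearch.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_elasticsearch(base_url="http://localhost:9200", attempts=30,
                           delay=2.0, fetch=fetch_json, sleep=time.sleep):
    """Poll the root endpoint until the node answers; return its version string.

    Raises RuntimeError if the node never responds within `attempts` tries.
    """
    for _ in range(attempts):
        try:
            info = fetch(base_url + "/")
            # Assumption: the root document carries {"version": {"number": ...}}.
            return info["version"]["number"]
        except (OSError, ValueError, KeyError):
            # Connection refused, malformed JSON, or unexpected body: retry.
            sleep(delay)
    raise RuntimeError("Elasticsearch did not come back up at " + base_url)
```

Calling wait_for_elasticsearch() right after the restart command should return the version string once the node is reachable. That is also a handy moment to compare the version against the one the river plugin expects.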

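Enable replica sets in MongoDB by following this tutorial. The river picks up changes from MongoDB’s oplog, which only exists when mongod runs as a member of a replica set.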
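
Before registering the river, it is worth confirming that the replica set actually has a PRIMARY member. Here is a minimal sketch, assuming the document printed by rs.status() in the mongo shell has already been loaded into a Python dict. How you obtain it, for instance through a driver’s replSetGetStatus command, is left out. The helper names are hypothetical.

```python
def primary_members(status):
    """Return the names of members reporting PRIMARY in an rs.status() document."""
    return [member["name"]
            for member in status.get("members", [])
            if member.get("stateStr") == "PRIMARY"]


def replica_set_ready(status):
    """True when the status command succeeded and exactly one member is PRIMARY."""
    return status.get("ok") == 1 and len(primary_members(status)) == 1
```

For a single-node set on 127.0.0.1:27017, primary_members should give back just that one name. If the list is empty, the set has probably not been initiated yet.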

Tell Elasticsearch to index the “person” collection in the testmongo database by issuing the following command in your terminal:

curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{ 
    "type": "mongodb", 
    "mongodb": { 
        "db": "testmongo", 
        "collection": "person"
    }, 
    "index": {
        "name": "mongoindex", 
        "type": "person" 
    }
}'
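
If you would rather script this step than type the curl command, the same request can be put together in Python. This is only a sketch that mirrors the command above: the endpoint, database, collection and index names come from this post, while the helper names (river_meta, build_river_request, register_river) are made up for illustration.

```python
import json
import urllib.request


def river_meta(db, collection, index_name, index_type):
    """Build the _meta document that tells the river what to index and where."""
    return {
        "type": "mongodb",
        "mongodb": {"db": db, "collection": collection},
        "index": {"name": index_name, "type": index_type},
    }


def build_river_request(river_name, meta, base_url="http://localhost:9200"):
    """Prepare (but do not send) the PUT request that registers the river."""
    url = "{}/_river/{}/_meta".format(base_url, river_name)
    body = json.dumps(meta).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )


def register_river(river_name, meta, base_url="http://localhost:9200"):
    """Send the request and return Elasticsearch's raw response text."""
    request = build_river_request(river_name, meta, base_url)
    with urllib.request.urlopen(request) as resp:
        return resp.read().decode("utf-8")
```

Calling register_river("mongodb", river_meta("testmongo", "person", "mongoindex", "person")) should be equivalent to the curl command. Keeping the request building separate from the sending makes it easy to print and inspect the JSON first.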

Add some data to MongoDB through the mongo shell:

use testmongo
var p = {firstName: "John", lastName: "Doe"}
db.person.save(p)

Use this command to search the data

curl -XGET 'http://localhost:9200/mongoindex/_search?q=firstName:John'
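
The same search can be run from a script. The sketch below builds the query URL and pulls the stored documents out of the response, assuming the usual response shape in which matches sit under hits.hits and each match carries the original document in _source. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def search_url(index, field, value, base_url="http://localhost:9200"):
    """Build a URI-search URL equivalent to ?q=field:value, with proper escaping."""
    query = urllib.parse.urlencode({"q": "{}:{}".format(field, value)})
    return "{}/{}/_search?{}".format(base_url, index, query)


def hit_sources(response):
    """Extract the indexed documents from a decoded search response."""
    return [hit["_source"] for hit in response.get("hits", {}).get("hits", [])]


def search(index, field, value, base_url="http://localhost:9200"):
    """Run the search and return the matching documents."""
    with urllib.request.urlopen(search_url(index, field, value, base_url)) as resp:
        return hit_sources(json.loads(resp.read().decode("utf-8")))
```

With the data inserted above, search("mongoindex", "firstName", "John") should return a list containing the John Doe document. An empty list means the river has not indexed anything yet, which usually points back at the replica set or the plugin versions.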

5 Responses to A Complete Guide to Integrating MongoDB with Elasticsearch

  1. San says:

    Hi Satish,

    With your help I am able to get rid of 404 exception. However, all my queries return NO hits. Following are the steps I have followed in sequence. Could you please help?

    1. Installed MongoDB 2.4.3
    2. Enabled ReplicaSet = rs0
    3. Enabled oplogSize = 100
    4. Restarted MongoDB Server
    4. Configured the rsConf variable on MongoShell
    5. Initiated the ReplicaSet
    ***************************************************************************
    rs0:PRIMARY> rs.status()
    {
      "set" : "rs0",
      "date" : ISODate("2013-05-21T17:42:41Z"),
      "myState" : 1,
      "members" : [
        {
          "_id" : 0,
          "name" : "127.0.0.1:27017",
          "health" : 1,
          "state" : 1,
          "stateStr" : "PRIMARY",
          "uptime" : 865,
          "optime" : {
            "t" : 1369157994,
            "i" : 1
          },
          "optimeDate" : ISODate("2013-05-21T17:39:54Z"),
          "self" : true
        }
      ],
      "ok" : 1
    }
    *************************************************************************
    5. Install ES
    6. Install attachment plugin
    7. Install River plugin
    8. Restart ES
    9. Curl ES to index documents from MongoDB with following command
    9. Added new documents to MongoDB
    10. On MongoDB, db.oplog.rs.find() returns data as below
    **************************************************************************
    rs0:PRIMARY> db.oplog.rs.find()
    { "ts" : { "t" : 1369152540, "i" : 1 }, "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
    { "ts" : { "t" : 1369152706, "i" : 1 }, "h" : NumberLong("7734203254950592529"), "v" : 2, "op" : "i", "ns" : "players.scores", "o" : { "_id" : ObjectId("519b9cc2871b3116f0b8006e"), "name" : "rohit", "score" : 20 } }
    { "ts" : { "t" : 1369157720, "i" : 1 }, "h" : NumberLong("2546253279621321716"), "v" : 2, "op" : "i", "ns" : "test.scores", "o" : { "_id" : ObjectId("519bb058b6fd31855ec8b0af"), "name" : "karthik", "score" : 20 } }
    { "ts" : { "t" : 1369157994, "i" : 1 }, "h" : NumberLong("-3356531630527802451"), "v" : 2, "op" : "i", "ns" : "test.scores", "o" : { "_id" : ObjectId("519bb16ab6fd31855ec8b0b0"), "name" : "dinesh", "score" : 20 } }
    ***************************************************************************
    11. Searching on ES for newly added document to scores collection on MongoDB
    does not return any results.

    I have tried posting a document directly to ES and it does return results, but documents from MongoDB do not. Appreciate your help in advance.

    Thank You

  2. Wez says:

    Thanks for the great tutorial. I just had to download the latest versions of the mapper and river plugins and it all worked :)

  3. Anibal says:

    Everybody, PAY ATTENTION with VERSIONS !!!
    If you use another version it doesn’t work….
    https://github.com/richardwilly98/elasticsearch-river-mongodb (check correct version of river, elasticsearch and mongo )

  4. vpjaiganesh says:

    simple question.. probably a stupid question..
    The curl command to index it doesnt seem to have the hostname of the mongo replicaset master – Does this article assume that the mongo replicaset is local to the elasticsearch server?
    If so, how to indicate to the elasticsearch server the hostname of mongodb?

  5. vpjaiganesh says:

    Ok.. as a response to my dumb question,, I found this curl command from https://github.com/richardwilly98/elasticsearch-river-mongodb/wiki
    Thanks!!

    $ curl -XPUT "localhost:9200/_river/${es.river.name}/_meta" -d '
    {
      "type": "mongodb",
      "mongodb": {
        "servers": [
          { "host": ${mongo.instance1.host}, "port": ${mongo.instance1.port} },
          { "host": ${mongo.instance2.host}, "port": ${mongo.instance2.port} }
        ],
        "options": {
          "secondary_read_preference": true,
          "drop_collection": ${mongo.drop.collection},
          "exclude_fields": ${mongo.exclude.fields},
          "include_fields": ${mongo.include.fields},
          "include_collection": ${mongo.include.collection},
          "import_all_collections": ${mongo.import.all.collections},
          "initial_timestamp": {
            "script_type": ${mongo.initial.timestamp.script.type},
            "script": ${mongo.initial.timestamp.script}
          },
          "skip_initial_import": ${mongo.skip.initial.import},
          "store_statistics": ${mongo.store.statistics}
        },
        "credentials": [
          { "db": "local", "user": ${mongo.local.user}, "password": ${mongo.local.password} },
          { "db": "admin", "user": ${mongo.db.user}, "password": ${mongo.db.password} }
        ],
        "db": ${mongo.db.name},
        "collection": ${mongo.collection.name},
        "gridfs": ${mongo.is.gridfs.collection},
        "filter": ${mongo.filter}
      },
      "index": {
        "name": ${es.index.name},
        "throttle_size": ${es.throttle.size},
        "bulk_size": ${es.bulk.size},
        "type": ${es.type.name},
        "bulk": {
          "actions": ${es.bulk.actions},
          "size": ${es.bulk.size},
          "concurrent_requests": ${es.bulk.concurrent.requests},
          "flush_interval": ${es.bulk.flush.interval}
        }
      }
    }'
