After almost two weeks and several re-installs and fresh installs, I finally managed to integrate MongoDB and Elasticsearch. Here is a step-by-step procedure for integrating them.
If you follow this procedure carefully, it will prevent errors like:
Exception: java.lang.NoSuchMethodError: com.mongodb.Mongo.fsyncAndLock()
{
"error" : "IndexMissingException[[testmongo] missing]",
"status" : 404
}
Follow these guides to install MongoDB and Elasticsearch.
After you have installed them, it's time to install the Elasticsearch river.
Download the snapshot from here. Extract it and copy its contents to elastic_search_root/plugins/plugins/mongodb_river
Note: The initial implementation tutorial given on the git page points to an older version of the snapshot, and it doesn't work with the latest versions of Elasticsearch and MongoDB.
Installing the MongoDB river
Run the following two commands to install the mongodb river. If you are on a slow connection, the first command can take more than 15 minutes.
ES_HOME/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.4.0
ES_HOME/bin/plugin -install richardwilly98/elasticsearch-river-mongodb/1.4.0
After you have installed both of them, restart Elasticsearch.
ES_HOME/bin/service/elasticsearch restart
Enable replica sets in MongoDB by following this tutorial.
Tell Elasticsearch to index the "person" collection in the testmongo database by issuing the following command in your terminal:
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{ "type": "mongodb", "mongodb": { "db": "testmongo", "collection": "person" }, "index": { "name": "mongoindex", "type": "person" } }'
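If you would rather send the same request from a script, here is a minimal Python sketch using only the standard library. The function names build_river_meta and register_river are my own, not part of the river plugin, and register_river assumes Elasticsearch is listening on localhost:9200 as in the curl command above. The script only prints the payload; call register_river(meta) yourself once the server is up.

```python
import json
import urllib.request


def build_river_meta(db, collection, index_name, index_type):
    """Build the same _meta document that the curl command above sends."""
    return {
        "type": "mongodb",
        "mongodb": {"db": db, "collection": collection},
        "index": {"name": index_name, "type": index_type},
    }


def register_river(meta, base_url="http://localhost:9200", river_name="mongodb"):
    """PUT the _meta document to Elasticsearch (needs a running server)."""
    request = urllib.request.Request(
        "{}/_river/{}/_meta".format(base_url, river_name),
        data=json.dumps(meta).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Only build and show the payload here; no network call is made.
meta = build_river_meta("testmongo", "person", "mongoindex", "person")
print(json.dumps(meta, indent=2))
```

Building the payload as a dictionary and serialising it with json.dumps avoids the quoting mistakes that are easy to make when pasting JSON into a shell.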
Add some data to MongoDB through the mongo shell:
use testmongo
var p = {firstName: "John", lastName: "Doe"}
db.person.save(p)
Use this command to search the data
curl -XGET 'http://localhost:9200/mongoindex/_search?q=firstName:John'
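The same search can be run from Python. This is a rough sketch with the standard library only; build_search_url, extract_sources and search are names I made up for illustration, and the host and port are assumed to match the curl command above. Only the URL is printed here, because search() needs a running Elasticsearch.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(index, field, value, base_url="http://localhost:9200"):
    """Build a URI search URL such as .../mongoindex/_search?q=firstName:John."""
    query = urllib.parse.urlencode({"q": "{}:{}".format(field, value)})
    return "{}/{}/_search?{}".format(base_url, index, query)


def extract_sources(search_response):
    """Return the stored documents from a parsed search response."""
    return [hit["_source"] for hit in search_response["hits"]["hits"]]


def search(index, field, value):
    """Run the search against a live server and return the matching documents."""
    with urllib.request.urlopen(build_search_url(index, field, value)) as response:
        return extract_sources(json.loads(response.read().decode("utf-8")))


# Show the URL that would be requested; no network call is made.
print(build_search_url("mongoindex", "firstName", "John"))
```

If the river is working, search("mongoindex", "firstName", "John") should return the document saved from the mongo shell; an empty list means nothing has been indexed yet.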
Hi Satish,
With your help I was able to get rid of the 404 exception. However, all my queries return NO hits. These are the steps I followed, in sequence. Could you please help?
1. Installed MongoDB 2.4.3
2. Enabled ReplicaSet = rs0
3. Enabled oplogSize = 100
4. Restarted MongoDB Server
5. Configured the rsConf variable on MongoShell
6. Initiated the ReplicaSet
***************************************************************************
rs0:PRIMARY> rs.status()
{
    "set" : "rs0",
    "date" : ISODate("2013-05-21T17:42:41Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "127.0.0.1:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 865,
            "optime" : {
                "t" : 1369157994,
                "i" : 1
            },
            "optimeDate" : ISODate("2013-05-21T17:39:54Z"),
            "self" : true
        }
    ],
    "ok" : 1
}
*************************************************************************
7. Installed ES
8. Installed the attachment plugin
9. Installed the River plugin
10. Restarted ES
11. Used curl to tell ES to index documents from MongoDB with the following command
12. Added new documents to MongoDB
13. On MongoDB, db.oplog.rs.find() returns the data below
**************************************************************************
rs0:PRIMARY> db.oplog.rs.find()
{ "ts" : { "t" : 1369152540, "i" : 1 }, "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
{ "ts" : { "t" : 1369152706, "i" : 1 }, "h" : NumberLong("7734203254950592529"), "v" : 2, "op" : "i", "ns" : "players.scores", "o" : { "_id" : ObjectId("519b9cc2871b3116f0b8006e"), "name" : "rohit", "score" : 20 } }
{ "ts" : { "t" : 1369157720, "i" : 1 }, "h" : NumberLong("2546253279621321716"), "v" : 2, "op" : "i", "ns" : "test.scores", "o" : { "_id" : ObjectId("519bb058b6fd31855ec8b0af"), "name" : "karthik", "score" : 20 } }
{ "ts" : { "t" : 1369157994, "i" : 1 }, "h" : NumberLong("-3356531630527802451"), "v" : 2, "op" : "i", "ns" : "test.scores", "o" : { "_id" : ObjectId("519bb16ab6fd31855ec8b0b0"), "name" : "dinesh", "score" : 20 } }
***************************************************************************
14. Searching on ES for the newly added documents in the scores collection on MongoDB does not return any results.
I have tried posting a document directly to ES, and it does return results, but not for the documents in MongoDB. I appreciate your help in advance.
Thank You
Thanks for the great tutorial. I just had to download the latest versions of the mapper and river plugins and it all worked 🙂
Everybody, PAY ATTENTION to VERSIONS!!!
If you use another version, it doesn't work...
https://github.com/richardwilly98/elasticsearch-river-mongodb (check the correct versions of the river, Elasticsearch and MongoDB)
Simple question.. probably a stupid question..
The curl command to index it doesn't seem to have the hostname of the mongo replica set master. Does this article assume that the mongo replica set is local to the Elasticsearch server?
If so, how do I tell the Elasticsearch server the hostname of MongoDB?
OK.. in response to my dumb question, I found this curl command at https://github.com/richardwilly98/elasticsearch-river-mongodb/wiki
Thanks!!
$ curl -XPUT "localhost:9200/_river/${es.river.name}/_meta" -d '
{
  "type": "mongodb",
  "mongodb": {
    "servers":
    [
      { "host": ${mongo.instance1.host}, "port": ${mongo.instance1.port} },
      { "host": ${mongo.instance2.host}, "port": ${mongo.instance2.port} }
    ],
    "options": {
      "secondary_read_preference" : true,
      "drop_collection": ${mongo.drop.collection},
      "exclude_fields": ${mongo.exclude.fields},
      "include_fields": ${mongo.include.fields},
      "include_collection": ${mongo.include.collection},
      "import_all_collections": ${mongo.import.all.collections},
      "initial_timestamp": {
        "script_type": ${mongo.initial.timestamp.script.type},
        "script": ${mongo.initial.timestamp.script}
      },
      "skip_initial_import" : ${mongo.skip.initial.import},
      "store_statistics" : ${mongo.store.statistics}
    },
    "credentials":
    [
      { "db": "local", "user": ${mongo.local.user}, "password": ${mongo.local.password} },
      { "db": "admin", "user": ${mongo.db.user}, "password": ${mongo.db.password} }
    ],
    "db": ${mongo.db.name},
    "collection": ${mongo.collection.name},
    "gridfs": ${mongo.is.gridfs.collection},
    "filter": ${mongo.filter}
  },
  "index": {
    "name": ${es.index.name},
    "throttle_size": ${es.throttle.size},
    "bulk_size": ${es.bulk.size},
    "type": ${es.type.name},
    "bulk": {
      "actions": ${es.bulk.actions},
      "size": ${es.bulk.size},
      "concurrent_requests": ${es.bulk.concurrent.requests},
      "flush_interval": ${es.bulk.flush.interval}
    }
  }
}'
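For anyone who only needs to point the river at a remote MongoDB, the "servers" part of that template can be added to the simple configuration from the article. Below is a hedged Python sketch: with_servers is a helper name I invented, and mongo1.example.com is a placeholder host, not a real server. It only builds and prints the JSON; the result would still have to be PUT to _river/mongodb/_meta.

```python
import json


def with_servers(meta, servers):
    """Return a copy of a river _meta dict with an explicit MongoDB server list."""
    updated = json.loads(json.dumps(meta))  # deep copy via a JSON round trip
    updated["mongodb"]["servers"] = [
        {"host": host, "port": port} for host, port in servers
    ]
    return updated


# The simple configuration used in the article.
base = {
    "type": "mongodb",
    "mongodb": {"db": "testmongo", "collection": "person"},
    "index": {"name": "mongoindex", "type": "person"},
}

# Placeholder host and the default MongoDB port; replace with your own.
remote = with_servers(base, [("mongo1.example.com", 27017)])
print(json.dumps(remote, indent=2))
```

The original dictionary is left untouched, so the local and remote variants can be kept side by side while testing.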
hello
I was trying to integrate MongoDB with Elasticsearch. After installing both plugins, I restarted ES and also MongoDB. Later, when I tried to execute the command below on the command line
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "testmongo",
    "collection": "person"
  },
  "index": {
    "name": "mongoindex",
    "type": "person"
  }
}'
It gave me
{"_index":"_river","_type":"mongodb","_id":"_meta","_version":4,"created":false}
Please help
thank you
Hi Jayesh..
The river's _meta document already exists, so when you PUT it again it is updated rather than created. That is why "created" is false, and the version number is incremented on every update to the same document; it is used for optimistic concurrency control.
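To make the response easier to read, here is a small Python sketch that interprets it. describe_put_result is a name I made up; it relies only on the fields visible in the response quoted in the question.

```python
import json


def describe_put_result(raw):
    """Summarise a PUT response like the one quoted in the question."""
    result = json.loads(raw)
    action = "created" if result.get("created") else "updated"
    return "{}/{}/{} {} (version {})".format(
        result["_index"], result["_type"], result["_id"], action, result["_version"]
    )


# The response text copied from the question.
response = '{"_index":"_river","_type":"mongodb","_id":"_meta","_version":4,"created":false}'
print(describe_put_result(response))
```

A version of 4 with "created" false simply means the same _meta document has been written four times, which is not an error.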
Hi,
I have configured everything as you described, but ES only pulls the ID column; it does not pull the other columns. Also, the ID column appears in ES as the index ID. Any idea?
Senthamarai
Hey, if I don't want to use the oplog, what is the other way to solve it?
Please update ASAP.
Thanks With Regards
Hi All,
Can you please tell me where you are running this curl command: in the Mongo shell or in the system terminal?
Thanks a lot in advance.
Hi All,
Sorry to disturb you all with my stupid question.
I came to know that we can use restClient, curl or Postman to execute these REST APIs.