As of ES 5.x , they have given this feature out of the box with logstash plugin.
This will periodically import data from database and push to ES server.
One has to create a simple import file given below (which is also described here) and use logstash to run the script. Logstash supports running this script on a schedule.
# file: contacts-index-logstash.conf
input {
jdbc {
jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
jdbc_user => "user"
jdbc_password => "pswd"
schedule => "* * * * *"
jdbc_validate_connection => true
jdbc_driver_library => "/path/to/latest/mysql-connector-java-jar"
jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
statement => "SELECT * from contacts where updatedAt > :sql_last_value"
}
}
output {
elasticsearch {
protocol => http
index => "contacts"
document_type => "contact"
document_id => "%{id}"
host => "ES_NODE_HOST"
}
}
# "* * * * *" -> run every minute
# sql_last_value is a built in parameter whose value is set to Thursday, 1 January 1970,
# or 0 if use_column_value is true and tracking_column is set
You can download the mysql jar from maven here.
In case indexes do not exist in ES when this script is executed, they will be created automatically. Just like a normal post call to elasticsearch
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…