Solr timed incremental update

1. Scheduled task execution

Many people use Windows scheduled tasks or Linux cron to request the incremental import URL periodically, which gives them scheduled incremental imports. This works and causes no problems. However, a more convenient approach, and one that integrates better with Solr, is to let a scheduler inside the Solr web application trigger the incremental import.
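
For comparison, the external approach amounts to requesting the delta-import URL on a schedule. Below is a minimal sketch in Python, assuming the host, port, webapp and core name used later in this article (localhost, 8089, solr, my_core). The function names are made up for illustration and are not part of Solr.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def delta_import_url(host="localhost", port=8089, webapp="solr", core="my_core"):
    """Build the delta-import URL (all default values are assumptions)."""
    query = urlencode({"command": "delta-import", "clean": "false", "commit": "true"})
    return f"http://{host}:{port}/{webapp}/{core}/dataimport?{query}"


def trigger_delta_import(url, timeout=30):
    """Send one GET request to Solr and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status
```

A cron entry or a Windows scheduled task would then run a small script that calls `trigger_delta_import(delta_import_url())` at the desired interval.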

2. Configuration

1. Download apache-solr-dataimportscheduler.jar and put it in the WEB-INF/lib directory of the solr application under Tomcat's webapps directory:

Download link: http://pan.baidu.com/s/1bpGnqJt

2. Edit WEB-INF/web.xml of the solr application and add the following before the servlet node:

<listener>  
          <listener-class>  
                org.apache.solr.handler.dataimport.scheduler.ApplicationListener  
          </listener-class>  
</listener> 

3. Create a conf folder under solr_home\solr, put the downloaded dataimport.properties in it, and adjust the settings to match your environment:

Download link: http://pan.baidu.com/s/1dFitqJf

#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################
 
#  to sync or not to sync
#  1 - active; anything else - inactive
syncEnabled=1
 
#  which cores to schedule
#  in a multi-core environment you can decide which cores you want synchronized
#  leave empty or comment it out if using single-core deployment
syncCores=my_core
 
#  solr server name or IP address
#  [defaults to localhost if empty]
server=localhost
 
#  solr server port
#  [defaults to 80 if empty]
port=8089
 
#  application name/context
#  [defaults to current ServletContextListener's context (app) name]
webapp=solr
 
#  URL params [mandatory]
#  remainder of URL
params=/dataimport?command=delta-import&clean=false&commit=true
 
#  schedule interval
#  number of minutes between two runs
#  [defaults to 30 if empty]
interval=1
 
#  interval between full index rebuilds, in minutes; default 7200, i.e. 5 days
#  empty, 0, or commented out: never rebuild the index
reBuildIndexInterval=7200
 
#  URL params used for the full index rebuild
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true
 
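#  start time from which the rebuild interval is counted; first actual run = reBuildIndexBeginTime + reBuildIndexInterval*60*1000
#  two formats: 2012-04-11 03:10:00 or 03:10:00; for the latter, the date part is filled in with the date the service started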
reBuildIndexBeginTime=03:10:00
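
The rebuild settings above can be checked with a little arithmetic. The sketch below only applies the formula stated in the comments (begin time plus interval) and is not the scheduler's actual code. The function names and the service start time are hypothetical.

```python
from datetime import datetime, timedelta


def parse_begin_time(value, service_start):
    """Parse reBuildIndexBeginTime; a time-only value borrows the service start date."""
    try:
        return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        time_part = datetime.strptime(value, "%H:%M:%S").time()
        return datetime.combine(service_start.date(), time_part)


def first_rebuild_time(begin_time, interval_minutes, service_start):
    """First full rebuild = begin time + interval, as the properties comment states."""
    begin = parse_begin_time(begin_time, service_start)
    return begin + timedelta(minutes=interval_minutes)


# Hypothetical service start, used only to fill in the missing date part.
started = datetime(2017, 8, 7, 16, 0, 0)
print(first_rebuild_time("03:10:00", 7200, started))  # 2017-08-12 03:10:00
```

With interval 7200 (5 days) and begin time 03:10:00, a service started on 2017-08-07 would, under this formula, run its first full rebuild on 2017-08-12 at 03:10:00.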

4. Edit db-data-config.xml under solr_home\solr\my_core\conf

<dataConfig>
    <dataSource name="source1"  type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/jfinal_demo?characterEncoding=utf8&amp;zeroDateTimeBehavior=convertToNull" user="root" password="123456"/>

    <document>
        <entity name="speech" dataSource="source1"     
                query="select * from  speech"    
                deltaImportQuery="select * from speech where id='${dih.delta.id}'"    
                deltaQuery="select id from speech where create_time &gt; '${dataimporter.last_index_time}'">    
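        <!-- The name attribute identifies the entity (one kind of document); any name can be used -->
        <!-- query is an SQL statement that selects the data to import from the database -->
            <!-- Each field maps a database column to a document field: column is the database column, name is the Solr field (it must already be defined in managed-schema) -->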
            <field column="id" name="s_id"/>
            <field column="content" name="s_content"/>
            <field column="operator" name="s_operator"/>
            <field column="person_synopsis" name="s_person_synopsis"/>
            <field column="person_title" name="s_person_title"/>
        </entity>
    </document>
</dataConfig>
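
To illustrate how deltaQuery and deltaImportQuery work together, here is a small simulation in Python. It assumes an in-memory SQLite table standing in for the MySQL speech table, with made-up rows. It mimics the two-step logic only and is not how Solr itself is implemented.

```python
import sqlite3

# In-memory stand-in for the MySQL `speech` table (rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE speech (id INTEGER PRIMARY KEY, content TEXT, create_time TEXT)")
conn.executemany(
    "INSERT INTO speech VALUES (?, ?, ?)",
    [
        (1, "old row", "2017-08-07 16:00:00"),
        (2, "new row", "2017-08-07 16:11:30"),
    ],
)


def delta_import(conn, last_index_time):
    """Mimic the two-step delta import: find changed ids, then fetch each row."""
    # Step 1 (deltaQuery): ids of rows created after the last import.
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM speech WHERE create_time > ?", (last_index_time,))]
    # Step 2 (deltaImportQuery): load the full row for each id found above.
    docs = []
    for row_id in ids:
        docs.extend(conn.execute(
            "SELECT id, content FROM speech WHERE id = ?", (row_id,)).fetchall())
    return docs


print(delta_import(conn, "2017-08-07 16:11:13"))  # [(2, 'new row')]
```

Because the deltaQuery filters on create_time, only rows inserted after the last import are picked up. Rows modified later would need a column that records the update time.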


5. Restart Tomcat. The scheduler then runs the delta query and the incremental update every minute. (To verify, add a new record to the database; after about a minute, a query will show the additional document in the index.)

6. After each successful index update, the dataimport.properties file under solr_home\solr\my_core\conf is updated as well:

#Mon Aug 07 16:11:13 CST 2017
last_index_time=2017-08-07 16\:11\:13
speech.last_index_time=2017-08-07 16\:11\:13
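
To check these timestamps from a script, the file can be parsed as shown in this minimal sketch. It assumes the file has exactly the format shown above. The function name is hypothetical.

```python
from datetime import datetime


def read_last_index_times(text):
    """Parse the key=value lines shown above, undoing the escaped colons."""
    times = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the timestamp comment
        key, _, value = line.partition("=")
        value = value.replace("\\:", ":")
        times[key] = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return times


sample = r"""#Mon Aug 07 16:11:13 CST 2017
last_index_time=2017-08-07 16\:11\:13
speech.last_index_time=2017-08-07 16\:11\:13
"""
print(read_last_index_times(sample)["speech.last_index_time"])  # 2017-08-07 16:11:13
```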
