wangking717 wrote
Recently my website's search function slowed down, and I traced the problem to fuzzy queries that use MySQL's LIKE operator.
This is a job for Sphinx. Here I install coreseek, a Chinese full-text search engine based on Sphinx, configure its MySQL data source, and use a PHP program to run Chinese searches.
1. Install the compilation tools
yum install make gcc g++ gcc-c++ libtool autoconf automake imake mysql-devel libxml2-devel expat-devel
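Before moving on, it can help to confirm that the build tools actually landed on the PATH. The following is a minimal sketch, not part of the original instructions; the function name missing_tools is hypothetical.

```shell
#!/bin/sh
# Print each named command that cannot be found on PATH.
# Returns 0 when every command is present, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example invocation:
#   missing_tools make gcc g++ libtool autoconf automake
```

If the example invocation prints nothing, the compilers and autotools needed by the following steps are available.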
2. Download coreseek and install mmseg Chinese word segmentation
Download http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz to /usr/local/src/
cd /usr/local/src
tar zxvf coreseek-3.2.14.tar.gz #decompress
cd coreseek-3.2.14
cd mmseg-3.2.14
./bootstrap #Warnings in the output can be ignored; errors must be resolved
./configure --prefix=/usr/local/mmseg3 #configure
make #compile
make install #install
3. Install coreseek
cd /usr/local/src
cd coreseek-3.2.14
cd csft-3.2.14
sh buildconf.sh #Warnings in the output can be ignored; errors must be resolved
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql #configure
make #compile
make install #install
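A quick sanity check after make install is to confirm that the two binaries used in the later steps exist under the chosen prefix. This is only a sketch; the function name check_coreseek_install is hypothetical, and the default prefix is the one passed to ./configure above.

```shell
#!/bin/sh
# Verify that indexer and searchd are installed and executable.
# $1: installation prefix (defaults to /usr/local/coreseek).
check_coreseek_install() {
    prefix=${1:-/usr/local/coreseek}
    for bin in indexer searchd; do
        if [ ! -x "$prefix/bin/$bin" ]; then
            echo "missing: $prefix/bin/$bin" >&2
            return 1
        fi
    done
    echo "ok: $prefix"
}
```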
4. Create a test database (test database and student table)
CREATE DATABASE test DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
USE test;
CREATE TABLE `student` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'ID',
  `student_name` varchar(100) DEFAULT NULL COMMENT '姓名',
  PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=0 DEFAULT CHARSET=utf8 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC;
INSERT INTO student (student_name) VALUES ('王琨');
INSERT INTO student (student_name) VALUES ('刘杰');
INSERT INTO student (student_name) VALUES ('王希');
INSERT INTO student (student_name) VALUES ('邓紫元');
5. Configure coreseek
cp /usr/local/src/coreseek-3.2.14/testpack/etc/csft_mysql.conf /usr/local/coreseek/etc/csft_mysql.conf #Copy the MySQL data source configuration file
vim /usr/local/coreseek/etc/csft_mysql.conf
↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
#source definition
source mySource
{
    type = mysql
    sql_host = localhost
    sql_user = root #Account
    sql_pass = 666666 #Password
    sql_db = test #MySQL database name
    sql_port = 3306
    sql_query_pre = SET NAMES utf8
    sql_query = SELECT id, student_name FROM student
    #The first column of sql_query (id) must be an integer
    #student_name is full-text indexed as a string/text field
    sql_query_info_pre = SET NAMES utf8 #Sets the correct character set for command-line queries
    sql_query_info = SELECT * FROM student WHERE id=$id #For command-line queries, reads the original row from the database
}
#index definition
index myIndex
{
    source = mySource #Name of the corresponding source
    path = /usr/local/coreseek/var/data/my_index
    docinfo = external
    mlock = 0
    morphology = none
    min_word_len = 1
    html_strip = 0
    #Chinese word segmentation settings; for details see http://www.coreseek.cn/products-install/coreseek_mmseg/
    charset_dictpath = /usr/local/mmseg3/etc
    charset_type = zh_cn.utf-8
}
#Global indexer definition
indexer
{
    mem_limit = 1024M #Memory usage limit
    max_iops = 100
    max_iosize = 0
}
#searchd service definition
searchd
{
    listen = 9312
    read_timeout = 5
    max_children = 30
    max_matches = 1000
    seamless_rotate = 0
    preopen_indexes = 0
    unlink_old = 1
    pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid
    log = /usr/local/coreseek/var/log/searchd_mysql.log
    query_log = /usr/local/coreseek/var/log/query_mysql.log
}
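If you would rather not type the database password into the file by hand, the source block can be generated from environment variables. The sketch below is one possible approach, not something the coreseek package provides: the function emit_source and the variables DB_HOST, DB_USER, DB_PASS and DB_PORT are all hypothetical names.

```shell
#!/bin/sh
# Emit a Sphinx "source" block for a MySQL table.
# $1: source name, $2: database name, $3: main SQL query.
# Connection settings come from the hypothetical DB_* variables,
# falling back to the values used in this article.
emit_source() {
    cat <<EOF
source $1
{
    type = mysql
    sql_host = ${DB_HOST:-localhost}
    sql_user = ${DB_USER:-root}
    sql_pass = ${DB_PASS:-}
    sql_db = $2
    sql_port = ${DB_PORT:-3306}
    sql_query_pre = SET NAMES utf8
    sql_query = $3
}
EOF
}
```

For example, running emit_source mySource test 'SELECT id, student_name FROM student' with DB_PASS set prints a block equivalent to the mySource section above (minus the sql_query_info lines), which can then be pasted or redirected into the configuration file.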
6. Start coreseek and build an index
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf #Start in the background
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate #Build the index
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop #Stop
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate #Update the index
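One detail worth keeping in mind: --rotate asks a running searchd to swap in the freshly built index, so it is only useful while searchd is up. A reindex script can check the pid file from the searchd section and pick the arguments accordingly. This is a sketch under that assumption; the function name indexer_args is hypothetical.

```shell
#!/bin/sh
# Echo the indexer arguments that fit the current searchd state.
# $1: path to the searchd pid file.
indexer_args() {
    pid_file=$1
    if [ -r "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "--all --rotate"   # searchd is running: rotate the index in place
    else
        echo "--all"            # searchd is not running: build the index files directly
    fi
}

# Example invocation (the substitution is left unquoted on purpose so it splits into two arguments):
#   /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf \
#       $(indexer_args /usr/local/coreseek/var/log/searchd_mysql.pid)
```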
7. Use PHP to test full-text Chinese retrieval
cp /usr/local/src/coreseek-3.2.14/testpack/api/sphinxapi.php /var/www/html/sphinxapi.php #Copy the API to the Apache root directory
cp /usr/local/src/coreseek-3.2.14/testpack/api/test_coreseek.php /var/www/html/test.php
cd /var/www/html/
vim test.php
↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓
<?php
//Note: this file must be saved with UTF-8 encoding
require ( "sphinxapi.php" );
$cl = new SphinxClient ();
$cl->SetServer ( '127.0.0.1', 9312);
//Return the matches as a plain array
$cl->SetArrayResult ( true );
/*
//Filter by ID
$cl->SetIDRange(3,4);
//Attribute fields such as sql_attr_uint are filtered with SetFilter, similar to SQL's WHERE group_id=2
$cl->SetFilter('group_id',array(2));
//Attribute fields such as sql_attr_uint can also be filtered by range, similar to SQL's WHERE group_id2>=6 AND group_id2<=8
$cl->SetFilterRange('group_id2',6,8);
*/
//Fetch the first 20 results; 0,20 works like LIMIT 0,20 in SQL
$cl->SetLimits(0,20);
//Fields that were not declared as sql_attr_* when indexing can be searched as full text
//'search string' is a placeholder for the text to search for
//"*" searches all indexes at once; an index name (for example test, or test,test2) restricts the search to those indexes
$res = $cl->Query ( 'search string', "*" );
//To search within specific full-text fields, use the extended match mode:
//$cl->SetMatchMode(SPH_MATCH_EXTENDED);
//$res=$cl->Query( '@title (测试)' , "*");
//$res=$cl->Query( '@title (测试) @content (网络)' , "*");
echo '<pre>';
print_r($res['matches']);
print_r($res);
print_r($cl->GetLastError());
print_r($cl->GetLastWarning());
echo '</pre>';
?>
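The index can also be queried without PHP, using the command-line search tool that Sphinx installs next to indexer and searchd; this is what the sql_query_info setting in the configuration is for. The wrapper below is a small sketch: the function name csft_search and the variables CORESEEK_PREFIX and CSFT_CONF are hypothetical, and the defaults assume the paths used in this article.

```shell
#!/bin/sh
# Run the Sphinx command-line search tool against myIndex.
# All arguments are passed through as search words or extra options.
csft_search() {
    "${CORESEEK_PREFIX:-/usr/local/coreseek}/bin/search" \
        -c "${CSFT_CONF:-/usr/local/coreseek/etc/csft_mysql.conf}" \
        -i myIndex "$@"
}

# Example invocation:
#   csft_search 王琨
```

This is handy for separating problems in the index itself from problems in the web server or PHP setup.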
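Finally, visit http://localhost/test.php to see the test results. If the page comes up blank, turn off the firewall and try again.
Turn off the firewall:
1) Turn off iptables
service iptables status #Check the status
service iptables stop
2) Turn off SELinux
/usr/sbin/sestatus -v #Check the status
vim /etc/selinux/config
Find the SELINUX line and change it to: SELINUX=disabled
reboot #Restart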
References:
1. http://www.coreseek.cn/
2. http://www.coreseek.cn/products-install/step_by_step/
3. http://www.coreseek.cn/products-install/mysql/
4. http://www.osyunwei.com/archives/7496.html
5. http://blog.csdn.net/e421083458/article/details/21529969