Better MySQL Search with Sphinx on CentOS 6.5

wangking717 wrote
Recently the search function on my website slowed down, and I traced the problem to MySQL fuzzy queries that use LIKE.
This is where Sphinx comes in. Here I install the coreseek Chinese search engine, configure it to read from a MySQL database, and use a PHP program to run Chinese full-text searches.

 

1. Install the compilation tools

yum install make gcc gcc-c++ libtool autoconf automake imake mysql-devel libxml2-devel expat-devel
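
Before moving on, it can help to confirm that the build tools actually ended up on the PATH. The following is a minimal sketch rather than part of the original steps: the function name missing_tools is made up for illustration, and it assumes that g++ is the command installed by gcc-c++ and mysql_config the one installed by mysql-devel.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `missing_tools make gcc g++ libtool autoconf automake mysql_config` should print nothing if the yum step succeeded.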

 

 

2. Download coreseek and install mmseg Chinese word segmentation

Download http://www.coreseek.cn/uploads/csft/3.2/coreseek-3.2.14.tar.gz to /usr/local/src/
cd /usr/local/src
tar zxvf coreseek-3.2.14.tar.gz #decompress
cd coreseek-3.2.14
cd mmseg-3.2.14
./bootstrap #Warnings in the output can be ignored; errors must be resolved before continuing
./configure --prefix=/usr/local/mmseg3  #configure
make #compile
make install #install

 

 

3. Install coreseek

cd /usr/local/src
cd coreseek-3.2.14
cd csft-3.2.14
sh buildconf.sh #Warnings in the output can be ignored; errors must be resolved before continuing
./configure --prefix=/usr/local/coreseek  --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql  #configure
make #compile
make install #install

 

 

4. Create a test database (test database and student table)

CREATE DATABASE test DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
USE test;

CREATE TABLE `student` (  
   `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'ID',  
   `student_name` varchar(100) DEFAULT NULL COMMENT '姓名',  
   PRIMARY KEY (`id`)  
 ) ENGINE=MyISAM AUTO_INCREMENT=0 DEFAULT CHARSET=utf8 CHECKSUM=1 DELAY_KEY_WRITE=1 ROW_FORMAT=DYNAMIC;

INSERT INTO student (student_name) VALUES ('王琨');
INSERT INTO student (student_name) VALUES ('刘杰');
INSERT INTO student (student_name) VALUES ('王希');
INSERT INTO student (student_name) VALUES ('邓紫元');

 

 

5. Configure coreseek

cp /usr/local/src/coreseek-3.2.14/testpack/etc/csft_mysql.conf /usr/local/coreseek/etc/csft_mysql.conf #Copy MySQL data source configuration file
vim /usr/local/coreseek/etc/csft_mysql.conf
↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓

source mySource
{
type                   = mysql
sql_host             = localhost
sql_user = root #account
sql_pass =666666 #Password
sql_db = test #MySQL database name
sql_port               = 3306
sql_query_pre     = SET NAMES utf8
sql_query = SELECT id, student_name FROM student #The first column (id) must be an integer document ID
#student_name is indexed as a full-text field
sql_query_info_pre = SET NAMES utf8 #Set the correct character set for command-line queries
sql_query_info = SELECT * FROM student WHERE id=$id #Used by command-line queries to fetch the original row from the database
}

#Index definition
index myIndex
{
source = mySource #Corresponding source name
path            = /usr/local/coreseek/var/data/my_index
docinfo = external
mlock            = 0
morphology        = none
min_word_len = 1
html_strip                = 0
#Chinese word segmentation configuration, please check for details: http://www.coreseek.cn/products-install/coreseek_mmseg/
charset_dictpath = /usr/local/mmseg3/etc/
charset_type        = zh_cn.utf-8
}

#Global indexer settings
indexer
{
mem_limit = 1024M #Memory usage limit
max_iops = 100
max_iosize = 0
}

#searchd service definition
searchd
{
listen                  =   9312
read_timeout        = 5
max_children        = 30
max_matches            = 1000
seamless_rotate        = 0
preopen_indexes        = 0
unlink_old            = 1
pid_file = /usr/local/coreseek/var/log/searchd_mysql.pid
log =/usr/local/coreseek/var/log/searchd_mysql.log
query_log =/usr/local/coreseek/var/log/query_mysql.log
}
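
A typo in this file usually surfaces only when indexer or searchd runs. As a quick sanity check, the sketch below pulls a single value out of the config so it can be inspected from a script. conf_value is a hypothetical helper, not a coreseek tool; it returns the first match for a key anywhere in the file and ignores section boundaries, which is enough for a one-source, one-index file like the one above.

```shell
# Print the value of the first "key = value" line whose key matches exactly.
# $1 = config file, $2 = key name
conf_value() {
    awk -v key="$2" '
        {
            sub(/#.*/, "")                      # drop trailing comments
            eq = index($0, "=")                 # the first "=" separates key and value
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { print v; exit }
        }
    ' "$1"
}
```

For instance, `ls -d "$(dirname "$(conf_value /usr/local/coreseek/etc/csft_mysql.conf path)")"` checks that the directory meant to hold the index files exists.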

 

 

6. Start coreseek and build an index

/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all #Build all indexes for the first time (searchd is not running yet)
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf #Start the search daemon in the background
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf --all --rotate #Rebuild the indexes while searchd is running
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/csft_mysql.conf --stop #Stop the search daemon
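
The --rotate flag is only useful while searchd is running, because it tells indexer to build new index files and then signal the daemon to swap them in. A script that rebuilds the index can pick the flags based on the pid file. This is a rough sketch under that assumption; indexer_flags is an illustrative name, and the liveness check uses kill -0, which only works when the caller is allowed to signal the searchd process (for example when both run as root, as in this article).

```shell
# Print the indexer flags that fit the current searchd state.
# $1 = searchd pid file (the pid_file setting in the searchd section)
indexer_flags() {
    if [ -s "$1" ] && kill -0 "$(cat "$1")" 2>/dev/null; then
        echo "--all --rotate"   # searchd is running: build new files, then ask it to rotate
    else
        echo "--all"            # searchd is stopped: write the index files directly
    fi
}
```

It could be used as `/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft_mysql.conf $(indexer_flags /usr/local/coreseek/var/log/searchd_mysql.pid)`, with the substitution deliberately left unquoted so that the two flags are passed as separate arguments.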

 

7. Use PHP to test full-text Chinese retrieval

cp /usr/local/src/coreseek-3.2.14/testpack/api/sphinxapi.php /var/www/html/sphinxapi.php #Copy the API to the Apache document root
cp /usr/local/src/coreseek-3.2.14/testpack/api/test_coreseek.php /var/www/html/test.php
cd /var/www/html/
vim test.php
↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓↓

<?php
//Note: this file must be saved in UTF-8 encoding
require ( "sphinxapi.php" );
$cl = new SphinxClient ();
$cl->SetServer ( '127.0.0.1', 9312);
//Return the matches as a plain array
$cl->SetArrayResult ( true );
/*
//Filter by document ID range

$cl->SetIDRange(3,4);
//Attributes declared with sql_attr_uint and similar types are filtered with SetFilter, like SQL's WHERE group_id=2
$cl->SetFilter('group_id',array(2));
//Such attributes can also be filtered by range, like SQL's WHERE group_id2>=6 AND group_id2<=8
$cl->SetFilterRange('group_id2',6,8);
*/

//Fetch the first 20 results; 0,20 works like LIMIT 0,20 in SQL

$cl->SetLimits(0,20);

//Any column not declared as an sql_attr_* attribute at indexing time is a full-text field and can be searched
//Replace 搜索字符串 ("search string") with the text to search for, e.g. 王琨 from the test data
$res = $cl->Query ( '搜索字符串', "*" );    //"*" searches all indexes at once; an index name (e.g. test, or test,test2) restricts the search to those indexes
//To search specific full-text fields, use extended match mode:
//$cl->SetMatchMode(SPH_MATCH_EXTENDED);
//$res=$cl->Query( '@title (测试)' , "*");
//$res=$cl->Query( '@title (测试) @content (网络)' , "*");

echo '<pre>';
print_r($res['matches']);
print_r($res);
print_r($cl->GetLastError());
print_r($cl->GetLastWarning());
echo '</pre>';

?>
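
The comment at the top of test.php asks for the file to be saved as UTF-8. One way to check that from the shell is to let iconv try to decode the file. This is only a sketch: it assumes the iconv command is installed, and is_utf8 is an illustrative name. A pure ASCII file also passes, since ASCII is a subset of UTF-8.

```shell
# Succeed (exit status 0) only if the file decodes cleanly as UTF-8.
# $1 = file to check
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

A typical call would be `is_utf8 /var/www/html/test.php || echo "test.php is not valid UTF-8"`.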

 

Finally, open http://localhost/test.php to see the test results. If the page comes up blank, turn off the firewall and try again.

 

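Turning off the firewall:

1. Stop iptables
service iptables status #Check the status
service iptables stop

2. Disable SELinux
/usr/sbin/sestatus -v #Check the status
vim /etc/selinux/config
Find the SELINUX line and change it to: SELINUX=disabled
reboot #Reboot so the change takes effect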
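
The edit to /etc/selinux/config can also be scripted instead of made by hand in vim. The function below is a sketch with a made-up name, set_selinux_mode; it rewrites only the SELINUX= line and leaves SELINUXTYPE= and everything else untouched.

```shell
# Rewrite the SELINUX= line of an SELinux config file in place.
# $1 = config file, $2 = new mode (enforcing, permissive, or disabled)
set_selinux_mode() {
    case "$2" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $2" >&2; return 1 ;;
    esac
    tmp_file=$(mktemp) || return 1
    # Write through cat so the original file keeps its owner and permissions.
    sed "s/^SELINUX=.*/SELINUX=$2/" "$1" > "$tmp_file" && cat "$tmp_file" > "$1"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}
```

Run as root, `set_selinux_mode /etc/selinux/config disabled` makes the same change as the vim step. To try the effect before rebooting, `setenforce 0` switches SELinux to permissive mode until the next boot.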

 

 

References:

1. http://www.coreseek.cn/

2. http://www.coreseek.cn/products-install/step_by_step/

3. http://www.coreseek.cn/products-install/mysql/

4. http://www.osyunwei.com/archives/7496.html

5. http://blog.csdn.net/e421083458/article/details/21529969

 

 
