For a PHP usage reference for Sphinx, see /var/www/to8to/trunk/gb_php/sphinxapi.php. A minimal query looks like this:
    include 'sphinxapi.php';
    $sp = new SphinxClient;
    $sp->SetServer('127.0.0.1', 9314);
    $sp->SetConnectTimeout(5);
    $sp->SetLimits(0, 10); // ($start, $limit)
    $keyword = !empty($_GET['kw']) ? trim($_GET['kw']) : '搜索内容'; // fall back to a placeholder query
    // filters and other conditions can be set here, before running the search
    $result = $sp->Query($keyword, 'iiyicms'); // the index may also be '*' or 'iiyicms:iiyicms_increment'

Tutorial
http://www.21andy.com/new/20100928/1973.html

Problem encountered
When compiling coreseek with Python support, the build fails with:
    py_layer.h:16:27: fatal error: Python.h: No such file or directory
The Python.h header file cannot be found.
Fix: on Ubuntu, install python-dev:
    sudo apt-get install python-dev

PECL sphinx extension
http://pecl.php.net/package/sphinx

Current sphinxapi source
https://code.google.com/p/sphinxsearch/source/browse/trunk/api/
http://code.google.com/p/sphinxsearch/source/browse/trunk/api/sphinxapi.php

1. PHP + MySQL + Sphinx: building an efficient site search engine, explained in detail
http://jingyan.baidu.com/article/95c9d20d9a7176ec4e756119.html
2. Using Coreseek-4.1 to quickly set up Sphinx Chinese word segmentation with PHP and MySQL full-text search
http://www.gretheer.com/2014/07/install-coreseek-sphinx-php-mysql.html
3. Sphinx downloads
http://sphinxsearch.com/downloads/release/
Find the package that matches your system, for example:
http://sphinxsearch.com/files/sphinxsearch_2.2.6-release-1~wheezy_i386.deb
Installation guide:
http://sphinxsearch.com/docs/archives/1.10/installing.html
Installing Sphinx and Coreseek together:
http://blog.atime.me/note/sphinx-coreseek-summary.html
http://github.tiankonguse.com/blog/2014/11/03/coreseek-install-log/
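
Development environment
Operating system: Ubuntu 12.04 x86-64
Coreseek: 4.1 beta (Sphinx-2.0.1)
Python: 2.7

Introduction to Sphinx/Coreseek
Sphinx is a high-performance full-text search engine written in C++. It is released under the GPL, with commercial licenses available for purchase. At the time of writing, the stable version is 2.1.7.
Coreseek is a Chinese full-text search engine based on Sphinx. It uses the MMSEG algorithm for Chinese word segmentation and adds a Python data source. Coreseek is released under GPLv2, with commercial licenses available. Its stable version is 3.2.14 (based on Sphinx-0.9.9) and its beta version is 4.1 (based on Sphinx-2.0.1). The official Coreseek forum announced at the end of 2013 that version 5.0 was coming soon, but no details have appeared since.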

Installing Sphinx/Coreseek
Download the Coreseek-4.1 source code:
    wget http://www.coreseek.cn/uploads/csft/4.0/coreseek-4.1-beta.tar.gz
    tar xvf coreseek-4.1-beta.tar.gz
    cd coreseek-4.1-beta
The archive unpacks into three directories. The main layout is:
    coreseek-4.1-beta/
        csft-4.1/        Sphinx-2.0.1 source as modified by Coreseek
            api/         implementations of the searchd query API
        mmseg-3.2.14/    libmmseg word segmentation library
        testpack/        tests and sample configurations
        README.txt       introduction and installation guide
Following the official installation guide, install mmseg first and then csft. If configure reports a missing header file, use apt-file to find the package that provides it.

Installing mmseg-3.2.14 (http://www.coreseek.cn/uploads/csft/3.2/mmseg-3.2.14.tar.gz)
Follow the official installation guide exactly:
    cd mmseg-3.2.14
    ./bootstrap
    ./configure --prefix=/usr/local/mmseg3
    make && sudo make install

Installing libiconv-1.14
Install libiconv first; it handles character set conversion.
    wget http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.14.tar.gz
    tar xvf libiconv-1.14.tar.gz
    cd libiconv-1.14
    ./configure
    make && sudo make install && sudo ldconfig
If your glibc is version 2.16 or later, make will very likely fail with this error:
    In file included from progname.c:26:0:
    ./stdio.h:1010:1: error: 'gets' undeclared here (not in a function)
    _GL_WARN_ON_USE (gets, "gets is a security hole - use fgets instead");
    ^
To fix it, download the patch file, decompress it, and apply it. From the libiconv-1.14 directory, run:
    wget -O - http://blog.atime.me/static/resource/libiconv-glibc-2.16.patch.gz | gzip -d - | patch -p0
Alternatively, comment out line 698 of srclib/stdio.in.h (this should be harmless), that is:
    // _GL_WARN_ON_USE (gets, "gets is a security hole - use fgets instead");

Installing csft-4.1
The configure arguments here differ slightly from the installation guide: --with-python enables the Python data source, and LIBS=-liconv avoids a link error at the end of the build.
    cd csft-4.1
    sh buildconf.sh
    ./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql --with-python LIBS=-liconv
    make -j2 && sudo make install
If sh buildconf.sh finishes without generating a configure script and prints "automake: warnings are treated as errors", change this line in configure.ac
    AM_INIT_AUTOMAKE([-Wall -Werror foreign])
to
    AM_INIT_AUTOMAKE([-Wall foreign])
that is, delete -Werror, then rerun sh buildconf.sh.
If configure reports that the MySQL header files are not installed, install the libmysql++-dev package.
If your gcc is version 4.7 or later, compilation may fail because of a bug in Sphinx:
    sphinxexpr.cpp:1746:43: error: 'ExprEval' was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
A patch attached to the bug report fixes this. From the csft-4.1 directory, run:
    wget -O - http://blog.atime.me/static/resource/sphinxexpr-gcc4.7.patch.gz | gzip -d - | patch -p0
Alternatively, edit lines 1746, 1777 and 1823 of src/sphinxexpr.cpp directly, changing ExprEval to this->ExprEval on each of the three lines.
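
Installing helper tools
Copy the searchd script from the csft-4.1/contrib/scripts directory to /etc/init.d/ so that the searchd service can be started and stopped with the service command.
After installing coreseek, copy all files and directories under /usr/local/coreseek/share/man/ into /usr/local/share/man/ so that the man command can show the manuals for indexer and searchd.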
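
Sphinx/Coreseek directory layout
After a successful installation following the steps above, /usr/local/coreseek contains these directories:
    bin/                 Sphinx programs
        searchd          search server
        indexer          index building tool
    etc/                 configuration files
        csft.conf        default configuration file
    share/
        man/             Sphinx man pages; copy them to the system man directory for easier lookup
    var/
        data/            default index storage directory
        log/             default directory for logs and the pid file
The typical workflow with Sphinx is roughly:
1. Build or update the index with indexer. If searchd is already running, pass the --rotate option.
2. Run searchd.
For example:
    cd /usr/local/coreseek
    ./bin/indexer --all   # first index build, using the default configuration file /usr/local/coreseek/etc/csft.conf
    ./bin/searchd         # uses the default configuration file /usr/local/coreseek/etc/csft.conf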

Sphinx/Coreseek configuration
For the configuration file, refer to the official Sphinx documentation and the sample configuration /usr/local/coreseek/etc/sphinx.conf.dist.

searchd
Sample configuration:
    searchd
    {
        listen = 9312
        listen = 9306:mysql41
        log = /usr/local/coreseek/var/log/searchd.log
        query_log = /usr/local/coreseek/var/log/query.log
        read_timeout = 5
        max_children = 30
        pid_file = /usr/local/coreseek/var/log/searchd.pid
        max_matches = 1000
        seamless_rotate = 1
        preopen_indexes = 1
        unlink_old = 1
        workers = threads # for RT to work
    }
The options are described under "searchd program configuration options" in the Sphinx documentation.
The second listen line, listen = 9306:mysql41, lets you reach searchd's indexes with the mysql client:
    mysql -h 127.0.0.1 -P 9306
You can then search the indexes using the SphinxQL query language.

indexer
Sample configuration:
    indexer
    {
        mem_limit = 1024M
        write_buffer = 16M
    }
The indexer tool has relatively few options; see "indexer program configuration options". Note that setting mem_limit above 2048M causes problems [1].
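
Data source and index configuration
See the sample configuration file /usr/local/coreseek/etc/sphinx.conf.dist and the "Data source configuration options" and "Index configuration options" sections of the official documentation.

Data sources
One thing to note about data sources: every document ID must be a unique positive integer (it cannot be 0) [6].

Python data source
Coreseek developed a Python data source that it bills as universal, and it is somewhat more convenient to use than xmlpipe2. In essence, a Python script fetches the data to be indexed. Coreseek documents its configuration and interface and provides a sample program.

xmlpipe2 data source
This is a "universal" data source officially supported by Sphinx. The data to be indexed is written to standard output following the xmlpipe2 schema.
In the source configuration, set type to xmlpipe2 and set the xmlpipe_command option to a command that writes an XML document conforming to the xmlpipe2 schema to standard output (stdout). For example:
    source news_src
    {
        type = xmlpipe2
        xmlpipe_command = cat /tmp/xmlpipe2_out.xml
    }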
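
As an illustration of what an xmlpipe_command might emit, here is a minimal Python sketch that builds an xmlpipe2 document. The function name build_xmlpipe2 and the field names title and content are hypothetical, chosen only for this example; a real schema would list whatever fields and attributes your index needs.

```python
import sys
from xml.sax.saxutils import escape, quoteattr


def build_xmlpipe2(docs, fields=("title", "content")):
    """Return an xmlpipe2 document as a string.

    docs is an iterable of (doc_id, {field: text}) pairs. Document IDs must
    be unique positive integers, as Sphinx requires.
    """
    lines = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Declare the full-text fields once, before any document.
    lines.append("<sphinx:schema>")
    for name in fields:
        lines.append("<sphinx:field name=%s/>" % quoteattr(name))
    lines.append("</sphinx:schema>")

    seen = set()
    for doc_id, values in docs:
        if not isinstance(doc_id, int) or doc_id <= 0 or doc_id in seen:
            raise ValueError("invalid or duplicate document id: %r" % (doc_id,))
        seen.add(doc_id)
        lines.append('<sphinx:document id="%d">' % doc_id)
        for name in fields:
            text = escape(values.get(name, ""))  # escape &, < and >
            lines.append("<%s>%s</%s>" % (name, text, name))
        lines.append("</sphinx:document>")

    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [(1, {"title": "Hello", "content": "Sphinx & Coreseek"})]
    sys.stdout.write(build_xmlpipe2(sample))
```

Saved as a script, this could be named directly in xmlpipe_command instead of cat on a pre-generated file.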
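
Building the index
Build an index with the indexer command:
    /usr/local/coreseek/bin/indexer --rotate $INDEX_NAME
Sphinx uses the indexer tool to build and update indexes, and indexer is said to reach indexing speeds of 10 to 15 MB/s [2]. In practice I built indexes from both the Python data source and the xmlpipe2 data source, and xmlpipe2 was marginally faster. Indexing 14 GB of text (about 500,000 files) through the Python data source produced a 2.3 GB index at a peak of about 2.8 MB/s. Chinese word segmentation is probably the bottleneck.

Custom Chinese dictionary
See the separate article on this topic.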
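
Querying
Sphinx supports querying data through SphinxAPI and SphinxQL.

SphinxAPI
SphinxAPI is used to communicate with searchd. Official implementations exist for PHP, Python and Java, and the API is described in the Sphinx documentation. The API implementations and sample programs shipped with Coreseek are in the csft-4.1/api/ directory.

SphinxQL
SphinxQL is the SQL dialect provided by Sphinx for querying and managing indexes. It supports more operations than SphinxAPI, such as deleting indexes, and is described in the Sphinx documentation.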
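
Practical application
Project overview
Some of the project's requirements:
The data that needs full-text search is HTML files: about 10 million in total, roughly 200 GB, growing by a few thousand files per day. Other sources such as PDF and MySQL will probably need to be searched in the future.
Provide a RESTful search interface that returns query results as JSON. The search service is mainly for internal use, so the query load is expected to be light.
To shorten the development cycle, the whole project is implemented in Python and uses Coreseek's bundled Python data source to build indexes.
The following third-party Python packages were used during development:
lxml-3.3.4: parsing HTML files
tornado-3.2: asynchronous HTTP server, asynchronous socket communication, and so on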
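
Design considerations
Indexing
indexer is a single-threaded tool, and as noted above it can hardly exceed 3 MB/s when building a Chinese index. A large index can therefore be split into several smaller indexes that are built concurrently and finally merged into one complete index.
Because the document set is very large but only a small number of documents change each day, it is best to use the officially recommended main + delta scheme: build the main index once at the start, then rebuild the delta index every day and merge it into the main index. See "Delta index updates" in the Sphinx documentation.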
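
The daily main + delta cycle could be scripted roughly as follows. This is only a sketch: the index names main and delta are hypothetical, the indexer path assumes the install prefix used earlier in these notes, and a production script would also need to wait for searchd to finish rotating the delta index before merging.

```python
import subprocess

INDEXER = "/usr/local/coreseek/bin/indexer"  # assumes the install prefix used above


def rebuild_delta_cmd(delta_index):
    """Command that rebuilds the delta index while searchd keeps running."""
    return [INDEXER, delta_index, "--rotate"]


def merge_cmd(main_index, delta_index):
    """Command that merges the delta index into the main index."""
    return [INDEXER, "--merge", main_index, delta_index, "--rotate"]


def daily_update(main_index="main", delta_index="delta", run=subprocess.check_call):
    """Rebuild the delta index, then merge it into the main index.

    `run` executes one command (a list of arguments); it is a parameter so
    the sequence can be inspected without actually invoking indexer.
    """
    for cmd in (rebuild_delta_cmd(delta_index), merge_cmd(main_index, delta_index)):
        run(cmd)
```

Calling daily_update() from a cron job would run both steps in order and stop at the first failing command, because subprocess.check_call raises on a non-zero exit status.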
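
Python notes
The project needs Python to find and parse HTML files.
File discovery does not use the walk function from the standard library's os module, because os.walk becomes rather slow when there are many files. If you are interested, look at a third-party library called betterwalk, which is said to be considerably faster than os.walk. In this project the directory structure of the files to be indexed is fixed and very regular, so os.listdir and os.lstat are enough. os.lstat also returns each file's last modification time, which is very useful when building the delta index.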
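
A minimal sketch of the os.listdir plus os.lstat approach for picking out files that belong in the delta index. The function name modified_since is made up for this example, and it assumes the cut-off is kept as a Unix timestamp (for instance, the time of the last main index build).

```python
import os
import stat


def modified_since(directory, since):
    """Return paths of regular files in `directory` modified after `since`.

    `since` is a Unix timestamp. Only one directory level is scanned; with a
    fixed layout this can be called once per known subdirectory instead of
    walking the whole tree.
    """
    changed = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        info = os.lstat(path)  # lstat does not follow symlinks
        if stat.S_ISREG(info.st_mode) and info.st_mtime > since:
            changed.append(path)
    return changed
```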
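HTML parsing uses the well-regarded lxml library. lxml usually offers several ways to parse and query an HTML file, so before choosing one it is worth reading lxml's benchmarks to see which approach is faster. For example, when finding HTML nodes by XPath, lxml's XPath class is several times faster than the xpath() method [3].
In addition, using Python threads for CPU-bound work, such as parsing HTML files in multiple threads, is a well-known pitfall because of the global interpreter lock. It is better to run the parsing in multiple processes and then collect the parsed results.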
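As mentioned earlier, indexer is slow, and it is usually the bottleneck when building an index. To speed up the overall build as much as possible, a sound approach is to run file scanning, file parsing and indexing concurrently. Because indexer cannot consume parsed files as fast as they are produced, the parsed files have to be cached, for example in memory. Memory is a scarce resource, however, so its use must be both capped and economical.
To cap memory use, set a threshold for the cache: when the cache is full, pause all file scanning and parsing processes, and resume them when the cache is nearly empty. On Linux this is easy to implement with the SIGSTOP and SIGCONT signals. Measuring exactly how much memory the cached objects occupy is the harder part. A compromise is to measure the memory use of the whole process or use some other indirect measure, or simply to cap the number of cached objects (which feels rather crude).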
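
The pause and resume logic can be sketched as below. This is an illustration, not the project's actual code: the class name WorkerThrottle and the high_water and low_water parameters are invented here, and the cap is expressed as a number of cached objects, which is the crude option mentioned above.

```python
import os
import signal


class WorkerThrottle(object):
    """Pause worker processes when a cache is full; resume when it drains."""

    def __init__(self, worker_pids, high_water, low_water, kill=os.kill):
        assert low_water < high_water
        self.worker_pids = list(worker_pids)
        self.high_water = high_water  # pause at or above this many cached items
        self.low_water = low_water    # resume at or below this many cached items
        self.paused = False
        self._kill = kill             # injectable so the logic can be tested

    def update(self, cached_items):
        """Call whenever the cache size changes; returns True while paused."""
        if not self.paused and cached_items >= self.high_water:
            self._signal_all(signal.SIGSTOP)
            self.paused = True
        elif self.paused and cached_items <= self.low_water:
            self._signal_all(signal.SIGCONT)
            self.paused = False
        return self.paused

    def _signal_all(self, signum):
        for pid in self.worker_pids:
            self._kill(pid, signum)
```

Using two thresholds rather than one avoids stopping and restarting the workers on every small change in cache size.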
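To economize on memory: as is well known, an ordinary Python object automatically gets a __dict__ attribute that stores its other attributes. What is less widely known is that the built-in dict type is memory-hungry. With few objects this is hard to notice, but if you keep a hundred thousand or a million Python objects in memory and profile them with a memory profiler such as Heapy, you will find that the __dict__ objects alone (not counting the data stored in them) consume a huge amount of memory.
Setting the class attribute __slots__ prevents the automatic creation of __dict__. In one published success story, the author saved 9 GB of memory with __slots__. Note that __slots__ has some side effects. An obvious one is that serializing an object whose class defines __slots__ with pickle protocol version 0 raises an error, while higher pickle protocols work fine [4] (protocol version 0 of cPickle is rarely used anyway, because it is both slow and space-hungry).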
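
A small sketch of the effect of __slots__. The ParsedDoc class and its attribute names are hypothetical stand-ins for whatever object the cache holds.

```python
class ParsedDoc(object):
    """A cached document; __slots__ stops Python creating a per-instance __dict__."""

    __slots__ = ("doc_id", "title", "body")

    def __init__(self, doc_id, title, body):
        self.doc_id = doc_id
        self.title = title
        self.body = body


doc = ParsedDoc(1, "Hello", "text")
print(hasattr(doc, "__dict__"))  # False: no per-instance dict was allocated

try:
    doc.extra = "not declared"   # attributes outside __slots__ are rejected
except AttributeError as exc:
    print("AttributeError:", exc)
```

The trade-off is that every attribute has to be declared up front, which is usually acceptable for simple record-like objects held in bulk.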
The data structure used for the cache also matters. The built-in list type is a poor fit: the cache should be a FIFO queue, and del list[0] is an O(n) operation [5]. collections.deque is a better choice.
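
A minimal sketch of a bounded FIFO cache built on collections.deque. The DocCache name and the max_items limit are assumptions made for this example.

```python
from collections import deque


class DocCache(object):
    """FIFO buffer between the parsing processes and the indexer feed."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._queue = deque()

    def is_full(self):
        return len(self._queue) >= self.max_items

    def put(self, doc):
        """Append a parsed document to the tail; O(1)."""
        if self.is_full():
            raise OverflowError("cache is full")
        self._queue.append(doc)

    def get(self):
        """Remove and return the oldest document; O(1), unlike list.pop(0)."""
        return self._queue.popleft()

    def __len__(self):
        return len(self._queue)
```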
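
Problems encountered in use
Corrupted index files crash searchd
During testing, searching for certain keywords made searchd crash on a failed assertion and restart automatically. Debugging and searching online suggested that one fairly large index (about 2 GB) had probably been corrupted during a merge. Running indextool --check on that index printed a large number of "FAILED, row not found" errors. So far there seems to be no fix other than upgrading Sphinx and rebuilding the corrupted index.
Socket receive timeout
When talking to searchd through the interface in sphinxapi.py, searchd may respond slowly if the index is large, and the client is then likely to raise a socket timeout exception. The client's default timeout is 1 second (a plain blocking Python socket has no timeout unless one is set). The fix is simple: call the SetConnectTimeout function in sphinxapi to set a longer timeout.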
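
To show how such a timeout surfaces at the socket level, here is a standard-library sketch that is independent of sphinxapi.py. The function name recv_with_timeout is hypothetical. It waits for the first 4 bytes of a reply, which is where searchd sends its protocol version after a client connects.

```python
import socket


def recv_with_timeout(host, port, timeout):
    """Connect, then wait up to `timeout` seconds for the first reply bytes.

    Returns the bytes received, or None if the server stayed silent too long.
    """
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        # The timeout given to create_connection also applies to later reads.
        return sock.recv(4)
    except socket.timeout:
        return None
    finally:
        sock.close()
```

With a short timeout, a slow server is indistinguishable from a dead one, which is why raising the timeout is the practical fix for large indexes.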

Resources and references
Sphinx 2.0.1 Documentation
Coreseek与第四城搜索 (Coreseek and 第四城 search): contains many detailed performance tests.
Coreseek Python data source interface documentation

Footnotes
[1] Sphinx indexer program configuration options, mem_limit, retrieved 2014-04-17
[2] Wikipedia: Sphinx, retrieved 2014-04-17
[3] lxml benchmarks and speed, xpath, retrieved 2014-04-18
[4] python pickling slots error, retrieved 2014-04-18
[5] Python Time Complexity, retrieved 2014-04-18
[6] Restrictions on the source data, retrieved 2014-09-18
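
Compiling and installing Sphinx, Coreseek Chinese word segmentation, and the PHP sphinx extension on Linux
http://www.icultivator.com/p/6347.html
Using Sphinx involves the following steps:
1. Have data to index.
2. Create a Sphinx configuration file.
3. Build the index.
4. Start the searchd service process (the default port is 9312).
5. Connect to the Sphinx service from PHP.

Starting Sphinx
    cd /usr/local/coreseek/bin/
    ./searchd
Options of the searchd startup command:
    -c          specify the configuration file
    --stop      stop the service
    --pidfile   explicitly specify a PID file
    -p          specify the port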

5. Installing the sphinx extension for PHP
    sudo pecl install sphinx
If this fails with "configure: error: Cannot find libsphinxclient headers", build and install libsphinxclient first:
    cd coreseek-4.1-beta/csft-4.1/api/libsphinxclient/
    ./configure --prefix=/usr/local/libsphinxclient
    make && sudo make install
Then go back to the extension's source directory and continue the build:
    ./configure --with-php-config=/usr/local/php/bin/php-config --with-sphinx=/usr/local/libsphinxclient
    make && sudo make install
Output such as "Installing shared extensions: /usr/lib/php5/20090626/sphinx.so" indicates success, and that directory now contains a sphinx.so file.
Load the .so file in php.ini:
    extension=/usr/lib/php5/20090626/sphinx.so
Restart Apache. If a sphinx section appears in the phpinfo() output, the extension is loaded.

Source of sphinxapi.php, the PHP API client:
    257. <?php
    258. //
    259. // $Id: sphinxapi.php 1566 2008-11-17 19:06:44Z shodan $
    260. //
    261. //
    262. // Copyright (c) 2001-2008, Andrew Aksyonoff. All rights reserved.
    263. //
    264. // This program is free software; you can redistribute it and/or modify
    265. // it under the terms of the GNU General Public License. You should have
    266. // received a copy of the GPL license along with this program; if you
    267. // did not, you can find it at http://www.gnu.org/
    268. //
    269. /////////////////////////////////////////////////////////////////////////////
    270. // PHP version of Sphinx searchd client (PHP API)
    271. /////////////////////////////////////////////////////////////////////////////
    272. /// known searchd commands
    273. define ( "SEARCHD_COMMAND_SEARCH", 0 );
    274. define ( "SEARCHD_COMMAND_EXCERPT", 1 );
    275. define ( "SEARCHD_COMMAND_UPDATE", 2 );
    276. define ( "SEARCHD_COMMAND_KEYWORDS",3 );
    277. define ( "SEARCHD_COMMAND_PERSIST", 4 );
    278. /// current client-side command implementation versions
    279. define ( "VER_COMMAND_SEARCH", 0x116 );
    280. define ( "VER_COMMAND_EXCERPT", 0x100 );
    281. define ( "VER_COMMAND_UPDATE", 0x102 );
    282. define ( "VER_COMMAND_KEYWORDS", 0x100 );
    283. /// known searchd status codes
    284. define ( "SEARCHD_OK", 0 );
    285. define ( "SEARCHD_ERROR", 1 );
    286. define ( "SEARCHD_RETRY", 2 );
    287. define ( "SEARCHD_WARNING", 3 );
    288. /// known match modes
    289. define ( "SPH_MATCH_ALL", 0 );
    290. define ( "SPH_MATCH_ANY", 1 );
    291. define ( "SPH_MATCH_PHRASE", 2 );
    292. define ( "SPH_MATCH_BOOLEAN", 3 );
    293. define ( "SPH_MATCH_EXTENDED", 4 );
    294. define ( "SPH_MATCH_FULLSCAN", 5 );
    295. define ( "SPH_MATCH_EXTENDED2", 6 ); // extended engine V2 (TEMPORARY, WILL BE REMOVED)
    296. /// known ranking modes (ext2 only)
    297. define ( "SPH_RANK_PROXIMITY_BM25", 0 ); ///< default mode, phrase proximity major factor and BM25 minor one
    298. define ( "SPH_RANK_BM25", 1 ); ///< statistical mode, BM25 ranking only (faster but worse quality)
    299. define ( "SPH_RANK_NONE", 2 ); ///< no ranking, all matches get a weight of 1
    300. define ( "SPH_RANK_WORDCOUNT", 3 ); ///< simple word-count weighting, rank is a weighted sum of per-field keyword occurrence counts
    301. define ( "SPH_RANK_PROXIMITY", 4 );
    302. define ( "SPH_RANK_MATCHANY", 5 );
    303. /// known sort modes
    304. define ( "SPH_SORT_RELEVANCE", 0 );
    305. define ( "SPH_SORT_ATTR_DESC", 1 );
    306. define ( "SPH_SORT_ATTR_ASC", 2 );
    307. define ( "SPH_SORT_TIME_SEGMENTS", 3 );
    308. define ( "SPH_SORT_EXTENDED", 4 );
    309. define ( "SPH_SORT_EXPR", 5 );
    310. /// known filter types
    311. define ( "SPH_FILTER_VALUES", 0 );
    312. define ( "SPH_FILTER_RANGE", 1 );
    313. define ( "SPH_FILTER_FLOATRANGE", 2 );
    314. /// known attribute types
    315. define ( "SPH_ATTR_INTEGER", 1 );
    316. define ( "SPH_ATTR_TIMESTAMP", 2 );
    317. define ( "SPH_ATTR_ORDINAL", 3 );
    318. define ( "SPH_ATTR_BOOL", 4 );
    319. define ( "SPH_ATTR_FLOAT", 5 );
    320. define ( "SPH_ATTR_BIGINT", 6 );
    321. define ( "SPH_ATTR_MULTI", 0x40000000 );
    322. /// known grouping functions
    323. define ( "SPH_GROUPBY_DAY", 0 );
    324. define ( "SPH_GROUPBY_WEEK", 1 );
    325. define ( "SPH_GROUPBY_MONTH", 2 );
    326. define ( "SPH_GROUPBY_YEAR", 3 );
    327. define ( "SPH_GROUPBY_ATTR", 4 );
    328. define ( "SPH_GROUPBY_ATTRPAIR", 5 );
    329. // important properties of PHP's integers:
    330. // - always signed (one bit short of PHP_INT_SIZE)
    331. // - conversion from string to int is saturated
    332. // - float is double
    333. // - div converts arguments to floats
    334. // - mod converts arguments to ints
    335. // the packing code below works as follows:
    336. // - when we got an int, just pack it
    337. // if performance is a problem, this is the branch users should aim for
    338. //
    339. // - otherwise, we got a number in string form
    340. // this might be due to different reasons, but we assume that this is
    341. // because it didn't fit into PHP int
    342. //
    343. // - factor the string into high and low ints for packing
    344. // - if we have bcmath, then it is used
    345. // - if we don't, we have to do it manually (this is the fun part)
    346. //
    347. // - x64 branch does factoring using ints
    348. // - x32 (ab)uses floats, since we can't fit unsigned 32-bit number into an int
    349. //
    350. // unpacking routines are pretty much the same.
    351. // - return ints if we can
    352. // - otherwise format number into a string
    353. /// pack 64-bit signed
    354. function sphPackI64 ( $v )
    355. {
    356. assert ( is_numeric($v) );
    357. // x64
    358. if ( PHP_INT_SIZE>=8 )
    359. {
    360. $v = (int)$v;
    361. return pack ( "NN", $v>>32, $v&0xFFFFFFFF );
    362. }
    363. // x32, int
    364. if ( is_int($v) )
    365. return pack ( "NN", $v < 0 ? -1 : 0, $v );
    366. // x32, bcmath
    367. if ( function_exists("bcmul") )
    368. {
    369. if ( bccomp ( $v, 0 ) == -1 )
    370. $v = bcadd ( "18446744073709551616", $v );
    371. $h = bcdiv ( $v, "4294967296", 0 );
    372. $l = bcmod ( $v, "4294967296" );
    373. return pack ( "NN", (float)$h, (float)$l ); // conversion to float is intentional; int would lose 31st bit
    374. }
    375. // x32, no-bcmath
    376. $p = max(0, strlen($v) - 13);
    377. $lo = abs((float)substr($v, $p));
    378. $hi = abs((float)substr($v, 0, $p));
    379. $m = $lo + $hi*1316134912.0; // (10 ^ 13) % (1 << 32) = 1316134912
    380. $q = floor($m/4294967296.0);
    381. $l = $m - ($q*4294967296.0);
    382. $h = $hi*2328.0 + $q; // (10 ^ 13) / (1 << 32) = 2328
    383. if ( $v<0 )
    384. {
    385. if ( $l==0 )
    386. $h = 4294967296.0 - $h;
    387. else
    388. {
    389. $h = 4294967295.0 - $h;
    390. $l = 4294967296.0 - $l;
    391. }
    392. }
    393. return pack ( "NN", $h, $l );
    394. }
    395. /// pack 64-bit unsigned
    396. function sphPackU64 ( $v )
    397. {
    398. assert ( is_numeric($v) );
    399. // x64
    400. if ( PHP_INT_SIZE>=8 )
    401. {
    402. assert ( $v>=0 );
    403. // x64, int
    404. if ( is_int($v) )
    405. return pack ( "NN", $v>>32, $v&0xFFFFFFFF );
    406. // x64, bcmath
    407. if ( function_exists("bcmul") )
    408. {
    409. $h = bcdiv ( $v, 4294967296, 0 );
    410. $l = bcmod ( $v, 4294967296 );
    411. return pack ( "NN", $h, $l );
    412. }
    413. // x64, no-bcmath
    414. $p = max ( 0, strlen($v) - 13 );
    415. $lo = (int)substr ( $v, $p );
    416. $hi = (int)substr ( $v, 0, $p );
    417. $m = $lo + $hi*1316134912;
    418. $l = $m % 4294967296;
    419. $h = $hi*2328 + (int)($m/4294967296);
    420. return pack ( "NN", $h, $l );
    421. }
    422. // x32, int
    423. if ( is_int($v) )
    424. return pack ( "NN", 0, $v );
    425. // x32, bcmath
    426. if ( function_exists("bcmul") )
    427. {
    428. $h = bcdiv ( $v, "4294967296", 0 );
    429. $l = bcmod ( $v, "4294967296" );
    430. return pack ( "NN", (float)$h, (float)$l ); // conversion to float is intentional; int would lose 31st bit
    431. }
    432. // x32, no-bcmath
    433. $p = max(0, strlen($v) - 13);
    434. $lo = (float)substr($v, $p);
    435. $hi = (float)substr($v, 0, $p);
    436. $m = $lo + $hi*1316134912.0;
    437. $q = floor($m / 4294967296.0);
    438. $l = $m - ($q * 4294967296.0);
    439. $h = $hi*2328.0 + $q;
    440. return pack ( "NN", $h, $l );
    441. }
    442. // unpack 64-bit unsigned
    443. function sphUnpackU64 ( $v )
    444. {
    445. list ( $hi, $lo ) = array_values ( unpack ( "N*N*", $v ) );
    446. if ( PHP_INT_SIZE>=8 )
    447. {
    448. if ( $hi<0 ) $hi += (1<<32); // because php 5.2.2 to 5.2.5 is totally fucked up again
    449. if ( $lo<0 ) $lo += (1<<32);
    450. // x64, int
    451. if ( $hi<=2147483647 )
    452. return ($hi<<32) + $lo;
    453. // x64, bcmath
    454. if ( function_exists("bcmul") )
    455. return bcadd ( $lo, bcmul ( $hi, "4294967296" ) );
    456. // x64, no-bcmath
    457. $C = 100000;
    458. $h = ((int)($hi / $C) << 32) + (int)($lo / $C);
    459. $l = (($hi % $C) << 32) + ($lo % $C);
    460. if ( $l>$C )
    461. {
    462. $h += (int)($l / $C);
    463. $l = $l % $C;
    464. }
    465. if ( $h==0 )
    466. return $l;
    467. return sprintf ( "%d%05d", $h, $l );
    468. }
    469. // x32, int
    470. if ( $hi==0 )
    471. {
    472. if ( $lo>0 )
    473. return $lo;
    474. return sprintf ( "%u", $lo );
    475. }
    476. $hi = sprintf ( "%u", $hi );
    477. $lo = sprintf ( "%u", $lo );
    478. // x32, bcmath
    479. if ( function_exists("bcmul") )
    480. return bcadd ( $lo, bcmul ( $hi, "4294967296" ) );
    481. // x32, no-bcmath
    482. $hi = (float)$hi;
    483. $lo = (float)$lo;
    484. $q = floor($hi/10000000.0);
    485. $r = $hi - $q*10000000.0;
    486. $m = $lo + $r*4967296.0;
    487. $mq = floor($m/10000000.0);
    488. $l = $m - $mq*10000000.0;
    489. $h = $q*4294967296.0 + $r*429.0 + $mq;
    490. $h = sprintf ( "%.0f", $h );
    491. $l = sprintf ( "%07.0f", $l );
    492. if ( $h=="0" )
    493. return sprintf( "%.0f", (float)$l );
    494. return $h . $l;
    495. }
    496. // unpack 64-bit signed
    497. function sphUnpackI64 ( $v )
    498. {
    499. list ( $hi, $lo ) = array_values ( unpack ( "N*N*", $v ) );
    500. // x64
    501. if ( PHP_INT_SIZE>=8 )
    502. {
    503. if ( $hi<0 ) $hi += (1<<32); // because php 5.2.2 to 5.2.5 is totally fucked up again
    504. if ( $lo<0 ) $lo += (1<<32);
    505. return ($hi<<32) + $lo;
    506. }
    507. // x32, int
    508. if ( $hi==0 )
    509. {
    510. if ( $lo>0 )
    511. return $lo;
    512. return sprintf ( "%u", $lo );
    513. }
    514. // x32, int
    515. elseif ( $hi==-1 )
    516. {
    517. if ( $lo<0 )
    518. return $lo;
    519. return sprintf ( "%.0f", $lo - 4294967296.0 );
    520. }
    521. $neg = "";
    522. $c = 0;
    523. if ( $hi<0 )
    524. {
    525. $hi = ~$hi;
    526. $lo = ~$lo;
    527. $c = 1;
    528. $neg = "-";
    529. }
    530. $hi = sprintf ( "%u", $hi );
    531. $lo = sprintf ( "%u", $lo );
    532. // x32, bcmath
    533. if ( function_exists("bcmul") )
    534. return $neg . bcadd ( bcadd ( $lo, bcmul ( $hi, "4294967296" ) ), $c );
    535. // x32, no-bcmath
    536. $hi = (float)$hi;
    537. $lo = (float)$lo;
    538. $q = floor($hi/10000000.0);
    539. $r = $hi - $q*10000000.0;
    540. $m = $lo + $r*4967296.0;
    541. $mq = floor($m/10000000.0);
    542. $l = $m - $mq*10000000.0 + $c;
    543. $h = $q*4294967296.0 + $r*429.0 + $mq;
    544. $h = sprintf ( "%.0f", $h );
    545. $l = sprintf ( "%07.0f", $l );
    546. if ( $h=="0" )
    547. return $neg . sprintf( "%.0f", (float)$l );
    548. return $neg . $h . $l;
    549. }
    550. /// sphinx searchd client class
    551. class SphinxClient
    552. {
    553. var $_host; ///< searchd host (default is "localhost")
    554. var $_port; ///< searchd port (default is 3312)
    555. var $_offset; ///< how many records to seek from result-set start (default is 0)
    556. var $_limit; ///< how many records to return from result-set starting at offset (default is 20)
    557. var $_mode; ///< query matching mode (default is SPH_MATCH_ALL)
    558. var $_weights; ///< per-field weights (default is 1 for all fields)
    559. var $_sort; ///< match sorting mode (default is SPH_SORT_RELEVANCE)
    560. var $_sortby; ///< attribute to sort by (default is "")
    561. var $_min_id; ///< min ID to match (default is 0, which means no limit)
    562. var $_max_id; ///< max ID to match (default is 0, which means no limit)
    563. var $_filters; ///< search filters
    564. var $_groupby; ///< group-by attribute name
    565. var $_groupfunc; ///< group-by function (to pre-process group-by attribute value with)
    566. var $_groupsort; ///< group-by sorting clause (to sort groups in result set with)
    567. var $_groupdistinct;///< group-by count-distinct attribute
    568. var $_maxmatches; ///< max matches to retrieve
    569. var $_cutoff; ///< cutoff to stop searching at (default is 0)
    570. var $_retrycount; ///< distributed retries count
    571. var $_retrydelay; ///< distributed retries delay
    572. var $_anchor; ///< geographical anchor point
    573. var $_indexweights; ///< per-index weights
    574. var $_ranker; ///< ranking mode (default is SPH_RANK_PROXIMITY_BM25)
    575. var $_maxquerytime; ///< max query time, milliseconds (default is 0, do not limit)
    576. var $_fieldweights; ///< per-field-name weights
    577. var $_overrides; ///< per-query attribute values overrides
    578. var $_select; ///< select-list (attributes or expressions, with optional aliases)
    579. var $_error; ///< last error message
    580. var $_warning; ///< last warning message
    581. var $_connerror; ///< connection error vs remote error flag
    582. var $_reqs; ///< requests array for multi-query
    583. var $_mbenc; ///< stored mbstring encoding
    584. var $_arrayresult; ///< whether $result["matches"] should be a hash or an array
    585. var $_timeout; ///< connect timeout
    586. /////////////////////////////////////////////////////////////////////////////
    587. // common stuff
    588. /////////////////////////////////////////////////////////////////////////////
    589. /// create a new client object and fill defaults
    590. function SphinxClient ()
    591. {
    592. // per-client-object settings
    593. $this->_host = "localhost";
    594. $this->_port = 3312;
    595. $this->_path = false;
    596. $this->_socket = false;
    597. // per-query settings
    598. $this->_offset = 0;
    599. $this->_limit = 20;
    600. $this->_mode = SPH_MATCH_ALL;
    601. $this->_weights = array ();
    602. $this->_sort = SPH_SORT_RELEVANCE;
    603. $this->_sortby = "";
    604. $this->_min_id = 0;
    605. $this->_max_id = 0;
    606. $this->_filters = array ();
    607. $this->_groupby = "";
    608. $this->_groupfunc = SPH_GROUPBY_DAY;
    609. $this->_groupsort = "@group desc";
    610. $this->_groupdistinct= "";
    611. $this->_maxmatches = 1000;
    612. $this->_cutoff = 0;
    613. $this->_retrycount = 0;
    614. $this->_retrydelay = 0;
    615. $this->_anchor = array ();
    616. $this->_indexweights= array ();
    617. $this->_ranker = SPH_RANK_PROXIMITY_BM25;
    618. $this->_maxquerytime= 0;
    619. $this->_fieldweights= array();
    620. $this->_overrides = array();
    621. $this->_select = "*";
    622. $this->_error = ""; // per-reply fields (for single-query case)
    623. $this->_warning = "";
    624. $this->_connerror = false;
    625. $this->_reqs = array (); // requests storage (for multi-query case)
    626. $this->_mbenc = "";
    627. $this->_arrayresult = false;
    628. $this->_timeout = 0;
    629. }
    630. function __destruct()
    631. {
    632. if ( $this->_socket !== false )
    633. fclose ( $this->_socket );
    634. }
    635. /// get last error message (string)
    636. function GetLastError ()
    637. {
    638. return $this->_error;
    639. }
    640. /// get last warning message (string)
    641. function GetLastWarning ()
    642. {
    643. return $this->_warning;
    644. }
    645. /// get last error flag (to tell network connection errors from searchd errors or broken responses)
    646. function IsConnectError()
    647. {
    648. return $this->_connerror;
    649. }
    650. /// set searchd host name (string) and port (integer)
    651. function SetServer ( $host, $port = 0 )
    652. {
    653. assert ( is_string($host) );
    654. if ( $host[0] == '/')
    655. {
    656. $this->_path = 'unix://' . $host;
    657. return;
    658. }
    659. if ( substr ( $host, 0, 7 )=="unix://" )
    660. {
    661. $this->_path = $host;
    662. return;
    663. }
    664. assert ( is_int($port) );
    665. $this->_host = $host;
    666. $this->_port = $port;
    667. $this->_path = '';
    668. }
    669. /// set server connection timeout (0 to remove)
    670. function SetConnectTimeout ( $timeout )
    671. {
    672. assert ( is_numeric($timeout) );
    673. $this->_timeout = $timeout;
    674. }
    675. function _Send ( $handle, $data, $length )
    676. {
    677. if ( feof($handle) || fwrite ( $handle, $data, $length ) !== $length )
    678. {
    679. $this->_error = 'connection unexpectedly closed (timed out?)';
    680. $this->_connerror = true;
    681. return false;
    682. }
    683. return true;
    684. }
    685. /////////////////////////////////////////////////////////////////////////////
    686. /// enter mbstring workaround mode
    687. function _MBPush ()
    688. {
    689. $this->_mbenc = "";
    690. if ( ini_get ( "mbstring.func_overload" ) & 2 )
    691. {
    692. $this->_mbenc = mb_internal_encoding();
    693. mb_internal_encoding ( "latin1" );
    694. }
    695. }
    696. /// leave mbstring workaround mode
    697. function _MBPop ()
    698. {
    699. if ( $this->_mbenc )
    700. mb_internal_encoding ( $this->_mbenc );
    701. }
    702. /// connect to searchd server
    703. function _Connect ()
    704. {
    705. if ( $this->_socket !== false )
    706. return $this->_socket;
    707. $errno = 0;
    708. $errstr = "";
    709. $this->_connerror = false;
    710. if ( $this->_path )
    711. {
    712. $host = $this->_path;
    713. $port = 0;
    714. }
    715. else
    716. {
    717. $host = $this->_host;
    718. $port = $this->_port;
    719. }
    720. if ( $this->_timeout<=0 )
    721. $fp = @fsockopen ( $host, $port, $errno, $errstr );
    722. else
    723. $fp = @fsockopen ( $host, $port, $errno, $errstr, $this->_timeout );
    724. if ( !$fp )
    725. {
    726. if ( $this->_path )
    727. $location = $this->_path;
    728. else
    729. $location = "{$this->_host}:{$this->_port}";
    730. $errstr = trim ( $errstr );
    731. $this->_error = "connection to $location failed (errno=$errno, msg=$errstr)";
    732. $this->_connerror = true;
    733. return false;
    734. }
    735. // check version
    736. list(,$v) = unpack ( "N*", fread ( $fp, 4 ) );
    737. $v = (int)$v;
    738. if ( $v<1 )
    739. {
    740. fclose ( $fp );
    741. $this->_error = "expected searchd protocol version 1+, got version '$v'";
    742. return false;
    743. }
    744. // all ok, send my version
    745. if ( !$this->_Send ( $fp, pack ( "N", 1 ), 4 ) )
    746. return false;
    747. return $fp;
    748. }
    749. /// get and check response packet from searchd server
    750. function _GetResponse ( $fp, $client_ver )
    751. {
    752. $response = "";
    753. $len = 0;
    754. $header = fread ( $fp, 8 );
    755. if ( strlen($header)==8 )
    756. {
    757. list ( $status, $ver, $len ) = array_values ( unpack ( "n2a/Nb", $header ) );
    758. $left = $len;
    759. while ( $left>0 && !feof($fp) )
    760. {
    761. $chunk = fread ( $fp, $left );
    762. if ( $chunk )
    763. {
    764. $response .= $chunk;
    765. $left -= strlen($chunk);
    766. }
    767. }
    768. }
    769. if ( $this->_socket === false )
    770. fclose ( $fp );
    771. // check response
    772. $read = strlen ( $response );
    773. if ( !$response || $read!=$len )
    774. {
    775. $this->_error = $len
    776. ? "failed to read searchd response (status=$status, ver=$ver, len=$len, read=$read)"
    777. : "received zero-sized searchd response";
    778. return false;
    779. }
    780. // check status
    781. if ( $status==SEARCHD_WARNING )
    782. {
    783. list(,$wlen) = unpack ( "N*", substr ( $response, 0, 4 ) );
    784. $this->_warning = substr ( $response, 4, $wlen );
    785. return substr ( $response, 4+$wlen );
    786. }
    787. if ( $status==SEARCHD_ERROR )
    788. {
    789. $this->_error = "searchd error: " . substr ( $response, 4 );
    790. return false;
    791. }
    792. if ( $status==SEARCHD_RETRY )
    793. {
    794. $this->_error = "temporary searchd error: " . substr ( $response, 4 );
    795. return false;
    796. }
    797. if ( $status!=SEARCHD_OK )
    798. {
    799. $this->_error = "unknown status code '$status'";
    800. return false;
    801. }
    802. // check version
    803. if ( $ver<$client_ver )
    804. {
    805. $this->_warning = sprintf ( "searchd command v.%d.%d older than client's v.%d.%d, some options might not work",
    806. $ver>>8, $ver&0xff, $client_ver>>8, $client_ver&0xff );
    807. }
    808. return $response;
    809. }
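_GetResponse() expects every reply to start with an 8-byte header: a 16-bit status, a 16-bit command version, and a 32-bit body length. The sketch below builds a fake packet and decodes it with the same unpack() format string. The status, version and body values are made up for illustration.

```php
<?php
// Fake packet: status 0, command version 0x0116, body length 5, then the body.
$packet = pack("nnN", 0, 0x0116, 5) . "hello";

$header = substr($packet, 0, 8);
// "n2a" reads two 16-bit values into keys a1 and a2;
// "Nb" reads one 32-bit value into key b.
list($status, $ver, $len) = array_values(unpack("n2a/Nb", $header));
$body = substr($packet, 8, $len);

// The version is split into major (high byte) and minor (low byte),
// the same way the warning message in _GetResponse() formats it.
printf("status=%d ver=%d.%d len=%d body=%s\n",
    $status, $ver >> 8, $ver & 0xff, $len, $body);
```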
    810. /////////////////////////////////////////////////////////////////////////////
    811. // searching
    812. /////////////////////////////////////////////////////////////////////////////
    813. /// set offset and count into result set,
    814. /// and optionally set max-matches and cutoff limits
    815. function SetLimits ( $offset, $limit, $max=0, $cutoff=0 )
    816. {
    817. assert ( is_int($offset) );
    818. assert ( is_int($limit) );
    819. assert ( $offset>=0 );
    820. assert ( $limit>0 );
    821. assert ( $max>=0 );
    822. $this->_offset = $offset;
    823. $this->_limit = $limit;
    824. if ( $max>0 )
    825. $this->_maxmatches = $max;
    826. if ( $cutoff>0 )
    827. $this->_cutoff = $cutoff;
    828. }
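SetLimits() takes an offset and a count, so page-based navigation has to be converted first. A small sketch, assuming 1-based page numbers; `pageToLimits` is a hypothetical helper, not part of the API.

```php
<?php
// Hypothetical helper: convert a 1-based page number into the
// ($offset, $limit) pair that SetLimits() expects.
function pageToLimits($page, $perPage)
{
    $page = max(1, (int)$page);
    $perPage = max(1, (int)$perPage); // SetLimits() asserts $limit > 0
    return array(($page - 1) * $perPage, $perPage);
}

list($offset, $limit) = pageToLimits(3, 10);
echo "offset=$offset limit=$limit\n";
// With a client object this would be passed as: $sp->SetLimits($offset, $limit);
```

Both values are cast to int because SetLimits() asserts is_int() on its first two arguments, and values taken from $_GET arrive as strings.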
    829. /// set maximum query time, in milliseconds, per-index
    830. /// integer, 0 means "do not limit"
    831. function SetMaxQueryTime ( $max )
    832. {
    833. assert ( is_int($max) );
    834. assert ( $max>=0 );
    835. $this->_maxquerytime = $max;
    836. }
    837. /// set matching mode
    838. function SetMatchMode ( $mode )
    839. {
    840. assert ( $mode==SPH_MATCH_ALL
    841. || $mode==SPH_MATCH_ANY
    842. || $mode==SPH_MATCH_PHRASE
    843. || $mode==SPH_MATCH_BOOLEAN
    844. || $mode==SPH_MATCH_EXTENDED
    845. || $mode==SPH_MATCH_FULLSCAN
    846. || $mode==SPH_MATCH_EXTENDED2 );
    847. $this->_mode = $mode;
    848. }
    849. /// set ranking mode
    850. function SetRankingMode ( $ranker )
    851. {
    852. assert ( $ranker==SPH_RANK_PROXIMITY_BM25
    853. || $ranker==SPH_RANK_BM25
    854. || $ranker==SPH_RANK_NONE
    855. || $ranker==SPH_RANK_WORDCOUNT
    856. || $ranker==SPH_RANK_PROXIMITY );
    857. $this->_ranker = $ranker;
    858. }
    859. /// set matches sorting mode
    860. function SetSortMode ( $mode, $sortby="" )
    861. {
    862. assert (
    863. $mode==SPH_SORT_RELEVANCE ||
    864. $mode==SPH_SORT_ATTR_DESC ||
    865. $mode==SPH_SORT_ATTR_ASC ||
    866. $mode==SPH_SORT_TIME_SEGMENTS ||
    867. $mode==SPH_SORT_EXTENDED ||
    868. $mode==SPH_SORT_EXPR );
    869. assert ( is_string($sortby) );
    870. assert ( $mode==SPH_SORT_RELEVANCE || strlen($sortby)>0 );
    871. $this->_sort = $mode;
    872. $this->_sortby = $sortby;
    873. }
    874. /// bind per-field weights by order
    875. /// DEPRECATED; use SetFieldWeights() instead
    876. function SetWeights ( $weights )
    877. {
    878. assert ( is_array($weights) );
    879. foreach ( $weights as $weight )
    880. assert ( is_int($weight) );
    881. $this->_weights = $weights;
    882. }
    883. /// bind per-field weights by name
    884. function SetFieldWeights ( $weights )
    885. {
    886. assert ( is_array($weights) );
    887. foreach ( $weights as $name=>$weight )
    888. {
    889. assert ( is_string($name) );
    890. assert ( is_int($weight) );
    891. }
    892. $this->_fieldweights = $weights;
    893. }
    894. /// bind per-index weights by name
    895. function SetIndexWeights ( $weights )
    896. {
    897. assert ( is_array($weights) );
    898. foreach ( $weights as $index=>$weight )
    899. {
    900. assert ( is_string($index) );
    901. assert ( is_int($weight) );
    902. }
    903. $this->_indexweights = $weights;
    904. }
    905. /// set IDs range to match
    906. /// only match records if document ID is between $min and $max (inclusive)
    907. function SetIDRange ( $min, $max )
    908. {
    909. assert ( is_numeric($min) );
    910. assert ( is_numeric($max) );
    911. assert ( $min<=$max );
    912. $this->_min_id = $min;
    913. $this->_max_id = $max;
    914. }
    915. /// set values set filter
    916. /// only match records where $attribute value is in given set
    917. function SetFilter ( $attribute, $values, $exclude=false )
    918. {
    919. assert ( is_string($attribute) );
    920. assert ( is_array($values) );
    921. assert ( count($values) );
    922. if ( is_array($values) && count($values) )
    923. {
    924. foreach ( $values as $value )
    925. assert ( is_numeric($value) );
    926. $this->_filters[] = array ( "type"=>SPH_FILTER_VALUES, "attr"=>$attribute, "exclude"=>$exclude, "values"=>$values );
    927. }
    928. }
    929. /// set range filter
    930. /// only match records if $attribute value is between $min and $max (inclusive)
    931. function SetFilterRange ( $attribute, $min, $max, $exclude=false )
    932. {
    933. assert ( is_string($attribute) );
    934. assert ( is_numeric($min) );
    935. assert ( is_numeric($max) );
    936. assert ( $min<=$max );
    937. $this->_filters[] = array ( "type"=>SPH_FILTER_RANGE, "attr"=>$attribute, "exclude"=>$exclude, "min"=>$min, "max"=>$max );
    938. }
    939. /// set float range filter
    940. /// only match records if $attribute value is between $min and $max (inclusive)
    941. function SetFilterFloatRange ( $attribute, $min, $max, $exclude=false )
    942. {
    943. assert ( is_string($attribute) );
    944. assert ( is_float($min) );
    945. assert ( is_float($max) );
    946. assert ( $min<=$max );
    947. $this->_filters[] = array ( "type"=>SPH_FILTER_FLOATRANGE, "attr"=>$attribute, "exclude"=>$exclude, "min"=>$min, "max"=>$max );
    948. }
    949. /// setup anchor point for geosphere distance calculations
    950. /// required to use @geodist in filters and sorting
    951. /// latitude and longitude must be in radians
    952. function SetGeoAnchor ( $attrlat, $attrlong, $lat, $long )
    953. {
    954. assert ( is_string($attrlat) );
    955. assert ( is_string($attrlong) );
    956. assert ( is_float($lat) );
    957. assert ( is_float($long) );
    958. $this->_anchor = array ( "attrlat"=>$attrlat, "attrlong"=>$attrlong, "lat"=>$lat, "long"=>$long );
    959. }
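SetGeoAnchor() asserts that both coordinates are floats and expects radians, while coordinates are usually stored in degrees. A short sketch of the conversion; the coordinates and attribute names are made up.

```php
<?php
// Made-up coordinates in degrees.
$latDeg = 39.9;
$longDeg = 116.4;

// deg2rad() returns a float, which satisfies the is_float() assertions.
$lat = deg2rad($latDeg);
$long = deg2rad($longDeg);

var_dump(is_float($lat), is_float($long));
// With a client object and hypothetical attribute names:
// $sp->SetGeoAnchor("lat_attr", "long_attr", $lat, $long);
```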
    960. /// set grouping attribute and function
    961. function SetGroupBy ( $attribute, $func, $groupsort="@group desc" )
    962. {
    963. assert ( is_string($attribute) );
    964. assert ( is_string($groupsort) );
    965. assert ( $func==SPH_GROUPBY_DAY
    966. || $func==SPH_GROUPBY_WEEK
    967. || $func==SPH_GROUPBY_MONTH
    968. || $func==SPH_GROUPBY_YEAR
    969. || $func==SPH_GROUPBY_ATTR
    970. || $func==SPH_GROUPBY_ATTRPAIR );
    971. $this->_groupby = $attribute;
    972. $this->_groupfunc = $func;
    973. $this->_groupsort = $groupsort;
    974. }
    975. /// set count-distinct attribute for group-by queries
    976. function SetGroupDistinct ( $attribute )
    977. {
    978. assert ( is_string($attribute) );
    979. $this->_groupdistinct = $attribute;
    980. }
    981. /// set distributed retries count and delay
    982. function SetRetries ( $count, $delay=0 )
    983. {
    984. assert ( is_int($count) && $count>=0 );
    985. assert ( is_int($delay) && $delay>=0 );
    986. $this->_retrycount = $count;
    987. $this->_retrydelay = $delay;
    988. }
    989. /// set result set format (hash or array; hash by default)
    990. /// PHP specific; needed for group-by-MVA result sets that may contain duplicate IDs
    991. function SetArrayResult ( $arrayresult )
    992. {
    993. assert ( is_bool($arrayresult) );
    994. $this->_arrayresult = $arrayresult;
    995. }
    996. /// set attribute values override
    997. /// there can be only one override per attribute
    998. /// $values must be a hash that maps document IDs to attribute values
    999. function SetOverride ( $attrname, $attrtype, $values )
    1000. {
    1001. assert ( is_string ( $attrname ) );
    1002. assert ( in_array ( $attrtype, array ( SPH_ATTR_INTEGER, SPH_ATTR_TIMESTAMP, SPH_ATTR_BOOL, SPH_ATTR_FLOAT, SPH_ATTR_BIGINT ) ) );
    1003. assert ( is_array ( $values ) );
    1004. $this->_overrides[$attrname] = array ( "attr"=>$attrname, "type"=>$attrtype, "values"=>$values );
    1005. }
    1006. /// set select-list (attributes or expressions), SQL-like syntax
    1007. function SetSelect ( $select )
    1008. {
    1009. assert ( is_string ( $select ) );
    1010. $this->_select = $select;
    1011. }
    1012. //////////////////////////////////////////////////////////////////////////////
    1013. /// clear all filters (for multi-queries)
    1014. function ResetFilters ()
    1015. {
    1016. $this->_filters = array();
    1017. $this->_anchor = array();
    1018. }
    1019. /// clear groupby settings (for multi-queries)
    1020. function ResetGroupBy ()
    1021. {
    1022. $this->_groupby = "";
    1023. $this->_groupfunc = SPH_GROUPBY_DAY;
    1024. $this->_groupsort = "@group desc";
    1025. $this->_groupdistinct= "";
    1026. }
    1027. /// clear all attribute value overrides (for multi-queries)
    1028. function ResetOverrides ()
    1029. {
    1030. $this->_overrides = array ();
    1031. }
    1032. //////////////////////////////////////////////////////////////////////////////
    1033. /// connect to searchd server, run given search query through given indexes,
    1034. /// and return the search results
    1035. function Query ( $query, $index="*", $comment="" )
    1036. {
    1037. assert ( empty($this->_reqs) );
    1038. $this->AddQuery ( $query, $index, $comment );
    1039. $results = $this->RunQueries ();
    1040. $this->_reqs = array (); // just in case it failed too early
    1041. if ( !is_array($results) )
    1042. return false; // probably network error; error message should be already filled
    1043. $this->_error = $results[0]["error"];
    1044. $this->_warning = $results[0]["warning"];
    1045. if ( $results[0]["status"]==SEARCHD_ERROR )
    1046. return false;
    1047. else
    1048. return $results[0];
    1049. }
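Query() returns either false or a single result array. The sketch below walks a hand-built array in the shape RunQueries() assembles when the default hash format is used (document IDs as keys). Every value in it is made up for illustration.

```php
<?php
// Hand-built sample in the shape produced for one query in hash mode.
$result = array(
    "error" => "",
    "warning" => "",
    "status" => 0,
    "fields" => array("title", "content"),
    "attrs" => array("group_id" => 1),
    "matches" => array(
        101 => array("weight" => "2", "attrs" => array("group_id" => "7")),
        205 => array("weight" => "1", "attrs" => array("group_id" => "3")),
    ),
    "total" => "2",
    "total_found" => "2",
    "time" => "0.001",
    "words" => array("test" => array("docs" => "2", "hits" => "3")),
);

// "matches" is only created when at least one document matched,
// so guard the lookup before iterating.
$ids = isset($result["matches"]) ? array_keys($result["matches"]) : array();
foreach ($ids as $id)
    printf("doc %d weight %s\n", $id, $result["matches"][$id]["weight"]);
```

Weights, totals and plain attribute values are strings because the parser formats them with sprintf ( "%u", ... ).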
    1050. /// helper to pack floats in network byte order
    1051. function _PackFloat ( $f )
    1052. {
    1053. $t1 = pack ( "f", $f ); // machine order
    1054. list(,$t2) = unpack ( "L*", $t1 ); // int in machine order
    1055. return pack ( "N", $t2 );
    1056. }
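_PackFloat() reinterprets the four bytes of a machine-order float as an integer so they can be written in network byte order. A standalone sketch of that trick and its inverse follows; the function names are hypothetical, and a 4-byte IEEE 754 float is assumed.

```php
<?php
// Float -> raw 32 bits -> big-endian bytes.
function packFloatBE($f)
{
    $raw = pack("f", $f);               // 4 bytes, machine byte order
    list(, $bits) = unpack("L*", $raw); // reinterpret as unsigned 32-bit int
    return pack("N", $bits);            // emit in network (big-endian) order
}

// Reverse direction, as done when float attributes are read from a response.
function unpackFloatBE($bytes)
{
    list(, $bits) = unpack("N*", $bytes);
    list(, $f) = unpack("f*", pack("L", $bits));
    return $f;
}

$wire = packFloatBE(1.5);
echo bin2hex($wire), "\n";
echo unpackFloatBE($wire), "\n";
```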
    1057. /// add query to multi-query batch
    1058. /// returns index into results array from RunQueries() call
    1059. function AddQuery ( $query, $index="*", $comment="" )
    1060. {
    1061. // mbstring workaround
    1062. $this->_MBPush ();
    1063. // build request
    1064. $req = pack ( "NNNNN", $this->_offset, $this->_limit, $this->_mode, $this->_ranker, $this->_sort ); // mode and limits
    1065. $req .= pack ( "N", strlen($this->_sortby) ) . $this->_sortby;
    1066. $req .= pack ( "N", strlen($query) ) . $query; // query itself
    1067. $req .= pack ( "N", count($this->_weights) ); // weights
    1068. foreach ( $this->_weights as $weight )
    1069. $req .= pack ( "N", (int)$weight );
    1070. $req .= pack ( "N", strlen($index) ) . $index; // indexes
    1071. $req .= pack ( "N", 1 ); // id64 range marker
    1072. $req .= sphPackU64 ( $this->_min_id ) . sphPackU64 ( $this->_max_id ); // id64 range
    1073. // filters
    1074. $req .= pack ( "N", count($this->_filters) );
    1075. foreach ( $this->_filters as $filter )
    1076. {
    1077. $req .= pack ( "N", strlen($filter["attr"]) ) . $filter["attr"];
    1078. $req .= pack ( "N", $filter["type"] );
    1079. switch ( $filter["type"] )
    1080. {
    1081. case SPH_FILTER_VALUES:
    1082. $req .= pack ( "N", count($filter["values"]) );
    1083. foreach ( $filter["values"] as $value )
    1084. $req .= sphPackI64 ( $value );
    1085. break;
    1086. case SPH_FILTER_RANGE:
    1087. $req .= sphPackI64 ( $filter["min"] ) . sphPackI64 ( $filter["max"] );
    1088. break;
    1089. case SPH_FILTER_FLOATRANGE:
    1090. $req .= $this->_PackFloat ( $filter["min"] ) . $this->_PackFloat ( $filter["max"] );
    1091. break;
    1092. default:
    1093. assert ( 0 && "internal error: unhandled filter type" );
    1094. }
    1095. $req .= pack ( "N", $filter["exclude"] );
    1096. }
    1097. // group-by clause, max-matches count, group-sort clause, cutoff count
    1098. $req .= pack ( "NN", $this->_groupfunc, strlen($this->_groupby) ) . $this->_groupby;
    1099. $req .= pack ( "N", $this->_maxmatches );
    1100. $req .= pack ( "N", strlen($this->_groupsort) ) . $this->_groupsort;
    1101. $req .= pack ( "NNN", $this->_cutoff, $this->_retrycount, $this->_retrydelay );
    1102. $req .= pack ( "N", strlen($this->_groupdistinct) ) . $this->_groupdistinct;
    1103. // anchor point
    1104. if ( empty($this->_anchor) )
    1105. {
    1106. $req .= pack ( "N", 0 );
    1107. } else
    1108. {
    1109. $a =& $this->_anchor;
    1110. $req .= pack ( "N", 1 );
    1111. $req .= pack ( "N", strlen($a["attrlat"]) ) . $a["attrlat"];
    1112. $req .= pack ( "N", strlen($a["attrlong"]) ) . $a["attrlong"];
    1113. $req .= $this->_PackFloat ( $a["lat"] ) . $this->_PackFloat ( $a["long"] );
    1114. }
    1115. // per-index weights
    1116. $req .= pack ( "N", count($this->_indexweights) );
    1117. foreach ( $this->_indexweights as $idx=>$weight )
    1118. $req .= pack ( "N", strlen($idx) ) . $idx . pack ( "N", $weight );
    1119. // max query time
    1120. $req .= pack ( "N", $this->_maxquerytime );
    1121. // per-field weights
    1122. $req .= pack ( "N", count($this->_fieldweights) );
    1123. foreach ( $this->_fieldweights as $field=>$weight )
    1124. $req .= pack ( "N", strlen($field) ) . $field . pack ( "N", $weight );
    1125. // comment
    1126. $req .= pack ( "N", strlen($comment) ) . $comment;
    1127. // attribute overrides
    1128. $req .= pack ( "N", count($this->_overrides) );
    1129. foreach ( $this->_overrides as $key => $entry )
    1130. {
    1131. $req .= pack ( "N", strlen($entry["attr"]) ) . $entry["attr"];
    1132. $req .= pack ( "NN", $entry["type"], count($entry["values"]) );
    1133. foreach ( $entry["values"] as $id=>$val )
    1134. {
    1135. assert ( is_numeric($id) );
    1136. assert ( is_numeric($val) );
    1137. $req .= sphPackU64 ( $id );
    1138. switch ( $entry["type"] )
    1139. {
    1140. case SPH_ATTR_FLOAT: $req .= $this->_PackFloat ( $val ); break;
    1141. case SPH_ATTR_BIGINT: $req .= sphPackI64 ( $val ); break;
    1142. default: $req .= pack ( "N", $val ); break;
    1143. }
    1144. }
    1145. }
    1146. // select-list
    1147. $req .= pack ( "N", strlen($this->_select) ) . $this->_select;
    1148. // mbstring workaround
    1149. $this->_MBPop ();
    1150. // store request to requests array
    1151. $this->_reqs[] = $req;
    1152. return count($this->_reqs)-1;
    1153. }
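Almost every string in the request built by AddQuery() is written the same way: a 32-bit big-endian byte length followed by the raw bytes. A minimal sketch of that pattern; `packString` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: 32-bit big-endian byte length, then the raw bytes.
function packString($s)
{
    return pack("N", strlen($s)) . $s;
}

$req = packString("iiyicms");
echo strlen($req), " bytes, prefix ", bin2hex(substr($req, 0, 4)), "\n";
```

The prefix must count bytes, not characters, which is presumably why the original wraps request building in the _MBPush() / _MBPop() mbstring workaround.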
    1154. /// connect to searchd, run queries batch, and return an array of result sets
    1155. function RunQueries ()
    1156. {
    1157. if ( empty($this->_reqs) )
    1158. {
    1159. $this->_error = "no queries defined, issue AddQuery() first";
    1160. return false;
    1161. }
    1162. // mbstring workaround
    1163. $this->_MBPush ();
    1164. if (!( $fp = $this->_Connect() ))
    1165. {
    1166. $this->_MBPop ();
    1167. return false;
    1168. }
    1169. ////////////////////////////
    1170. // send query, get response
    1171. ////////////////////////////
    1172. $nreqs = count($this->_reqs);
    1173. $req = join ( "", $this->_reqs );
    1174. $len = 4+strlen($req);
    1175. $req = pack ( "nnNN", SEARCHD_COMMAND_SEARCH, VER_COMMAND_SEARCH, $len, $nreqs ) . $req; // add header
    1176. if ( !( $this->_Send ( $fp, $req, $len+8 ) ) ||
    1177. !( $response = $this->_GetResponse ( $fp, VER_COMMAND_SEARCH ) ) )
    1178. {
    1179. $this->_MBPop ();
    1180. return false;
    1181. }
    1182. $this->_reqs = array ();
    1183. //////////////////
    1184. // parse response
    1185. //////////////////
    1186. $p = 0; // current position
    1187. $max = strlen($response); // max position for checks, to protect against broken responses
    1188. $results = array ();
    1189. for ( $ires=0; $ires<$nreqs && $p<$max; $ires++ )
    1190. {
    1191. $results[] = array();
    1192. $result =& $results[$ires];
    1193. $result["error"] = "";
    1194. $result["warning"] = "";
    1195. // extract status
    1196. list(,$status) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1197. $result["status"] = $status;
    1198. if ( $status!=SEARCHD_OK )
    1199. {
    1200. list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1201. $message = substr ( $response, $p, $len ); $p += $len;
    1202. if ( $status==SEARCHD_WARNING )
    1203. {
    1204. $result["warning"] = $message;
    1205. } else
    1206. {
    1207. $result["error"] = $message;
    1208. continue;
    1209. }
    1210. }
    1211. // read schema
    1212. $fields = array ();
    1213. $attrs = array ();
    1214. list(,$nfields) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1215. while ( $nfields-->0 && $p<$max )
    1216. {
    1217. list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1218. $fields[] = substr ( $response, $p, $len ); $p += $len;
    1219. }
    1220. $result["fields"] = $fields;
    1221. list(,$nattrs) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1222. while ( $nattrs-->0 && $p<$max )
    1223. {
    1224. list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1225. $attr = substr ( $response, $p, $len ); $p += $len;
    1226. list(,$type) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1227. $attrs[$attr] = $type;
    1228. }
    1229. $result["attrs"] = $attrs;
    1230. // read match count
    1231. list(,$count) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1232. list(,$id64) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1233. // read matches
    1234. $idx = -1;
    1235. while ( $count-->0 && $p<$max )
    1236. {
    1237. // index into result array
    1238. $idx++;
    1239. // parse document id and weight
    1240. if ( $id64 )
    1241. {
    1242. $doc = sphUnpackU64 ( substr ( $response, $p, 8 ) ); $p += 8;
    1243. list(,$weight) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1244. }
    1245. else
    1246. {
    1247. list ( $doc, $weight ) = array_values ( unpack ( "N*N*",
    1248. substr ( $response, $p, 8 ) ) );
    1249. $p += 8;
    1250. if ( PHP_INT_SIZE>=8 )
    1251. {
    1252. // x64 route, workaround broken unpack() in 5.2.2+
    1253. if ( $doc<0 ) $doc += (1<<32);
    1254. } else
    1255. {
    1256. // x32 route, workaround php signed/unsigned braindamage
    1257. $doc = sprintf ( "%u", $doc );
    1258. }
    1259. }
    1260. $weight = sprintf ( "%u", $weight );
    1261. // create match entry
    1262. if ( $this->_arrayresult )
    1263. $result["matches"][$idx] = array ( "id"=>$doc, "weight"=>$weight );
    1264. else
    1265. $result["matches"][$doc]["weight"] = $weight;
    1266. // parse and create attributes
    1267. $attrvals = array ();
    1268. foreach ( $attrs as $attr=>$type )
    1269. {
    1270. // handle 64bit ints
    1271. if ( $type==SPH_ATTR_BIGINT )
    1272. {
    1273. $attrvals[$attr] = sphUnpackI64 ( substr ( $response, $p, 8 ) ); $p += 8;
    1274. continue;
    1275. }
    1276. // handle floats
    1277. if ( $type==SPH_ATTR_FLOAT )
    1278. {
    1279. list(,$uval) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1280. list(,$fval) = unpack ( "f*", pack ( "L", $uval ) );
    1281. $attrvals[$attr] = $fval;
    1282. continue;
    1283. }
    1284. // handle everything else as unsigned ints
    1285. list(,$val) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1286. if ( $type & SPH_ATTR_MULTI )
    1287. {
    1288. $attrvals[$attr] = array ();
    1289. $nvalues = $val;
    1290. while ( $nvalues-->0 && $p<$max )
    1291. {
    1292. list(,$val) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1293. $attrvals[$attr][] = sprintf ( "%u", $val );
    1294. }
    1295. } else
    1296. {
    1297. $attrvals[$attr] = sprintf ( "%u", $val );
    1298. }
    1299. }
    1300. if ( $this->_arrayresult )
    1301. $result["matches"][$idx]["attrs"] = $attrvals;
    1302. else
    1303. $result["matches"][$doc]["attrs"] = $attrvals;
    1304. }
    1305. list ( $total, $total_found, $msecs, $words ) =
    1306. array_values ( unpack ( "N*N*N*N*", substr ( $response, $p, 16 ) ) );
    1307. $result["total"] = sprintf ( "%u", $total );
    1308. $result["total_found"] = sprintf ( "%u", $total_found );
    1309. $result["time"] = sprintf ( "%.3f", $msecs/1000 );
    1310. $p += 16;
    1311. while ( $words-->0 && $p<$max )
    1312. {
    1313. list(,$len) = unpack ( "N*", substr ( $response, $p, 4 ) ); $p += 4;
    1314. $word = substr ( $response, $p, $len ); $p += $len;
    1315. list ( $docs, $hits ) = array_values ( unpack ( "N*N*", substr ( $response, $p, 8 ) ) ); $p += 8;
    1316. $result["words"][$word] = array (
    1317. "docs"=>sprintf ( "%u", $docs ),
    1318. "hits"=>sprintf ( "%u", $hits ) );
    1319. }
    1320. }
    1321. $this->_MBPop ();
    1322. return $results;
    1323. }
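The response parser in RunQueries() keeps a cursor $p into the buffer and advances it after every read. The sketch below isolates that pattern for a list of length-prefixed strings, such as the field names in the schema. `readStrings` is a hypothetical helper, and its bounds checks are slightly stricter than the `$p<$max` test used in the original.

```php
<?php
// Hypothetical helper: read $n length-prefixed strings starting at $p,
// stopping early if the buffer runs out.
function readStrings($response, &$p, $n)
{
    $max = strlen($response);
    $out = array();
    while ($n-- > 0 && $p + 4 <= $max)
    {
        list(, $len) = unpack("N*", substr($response, $p, 4)); $p += 4;
        if ($p + $len > $max)
            break; // truncated string, stop instead of reading garbage
        $out[] = substr($response, $p, $len); $p += $len;
    }
    return $out;
}

// Fake buffer holding a field count followed by two field names.
$response = pack("N", 2)
    . pack("N", 5) . "title"
    . pack("N", 7) . "content";

$p = 0;
list(, $nfields) = unpack("N*", substr($response, $p, 4)); $p += 4;
$fields = readStrings($response, $p, $nfields);
print_r($fields);
```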
    1324. /////////////////////////////////////////////////////////////////////////////
    1325. // excerpts generation
    1326. /////////////////////////////////////////////////////////////////////////////
    1327. /// connect to searchd server, and generate excerpts (snippets)
    1328. /// of given documents for given query. returns false on failure,
    1329. /// an array of snippets on success
    1330. function BuildExcerpts ( $docs, $index, $words, $opts=array() )
    1331. {
    1332. assert ( is_array($docs) );
    1333. assert ( is_string($index) );
    1334. assert ( is_string($words) );
    1335. assert ( is_array($opts) );
    1336. $this->_MBPush ();
    1337. if (!( $fp = $this->_Connect() ))
    1338. {
    1339. $this->_MBPop();
    1340. return false;
    1341. }
    1342. /////////////////
    1343. // fixup options
    1344. /////////////////
    1345. if ( !isset($opts["before_match"]) ) $opts["before_match"] = "<b>";
    1346. if ( !isset($opts["after_match"]) ) $opts["after_match"] = "</b>";
    1347. if ( !isset($opts["chunk_separator"]) ) $opts["chunk_separator"] = " ... ";
    1348. if ( !isset($opts["limit"]) ) $opts["limit"] = 256;
    1349. if ( !isset($opts["around"]) ) $opts["around"] = 5;
    1350. if ( !isset($opts["exact_phrase"]) ) $opts["exact_phrase"] = false;
    1351. if ( !isset($opts["single_passage"]) ) $opts["single_passage"] = false;
    1352. if ( !isset($opts["use_boundaries"]) ) $opts["use_boundaries"] = false;
    1353. if ( !isset($opts["weight_order"]) ) $opts["weight_order"] = false;
    1354. /////////////////
    1355. // build request
    1356. /////////////////
    1357. // v.1.0 req
    1358. $flags = 1; // remove spaces
    1359. if ( $opts["exact_phrase"] ) $flags |= 2;
    1360. if ( $opts["single_passage"] ) $flags |= 4;
    1361. if ( $opts["use_boundaries"] ) $flags |= 8;
    1362. if ( $opts["weight_order"] ) $flags |= 16;
    1363. $req = pack ( "NN", 0, $flags ); // mode=0, flags=$flags
    1364. $req .= pack ( "N", strlen($index) ) . $index; // req index
    1365. $req .= pack ( "N", strlen($words) ) . $words; // req words
    1366. // options
    1367. $req .= pack ( "N", strlen($opts["before_match"]) ) . $opts["before_match"];
    1368. $req .= pack ( "N", strlen($opts["after_match"]) ) . $opts["after_match"];
    1369. $req .= pack ( "N", strlen($opts["chunk_separator"]) ) . $opts["chunk_separator"];
    1370. $req .= pack ( "N", (int)$opts["limit"] );
    1371. $req .= pack ( "N", (int)$opts["around"] );
    1372. // documents
    1373. $req .= pack ( "N", count($docs) );
    1374. foreach ( $docs as $doc )
    1375. {
    1376. assert ( is_string($doc) );
    1377. $req .= pack ( "N", strlen($doc) ) . $doc;
    1378. }
    1379. ////////////////////////////
    1380. // send query, get response
    1381. ////////////////////////////
    1382. $len = strlen($req);
    1383. $req = pack ( "nnN", SEARCHD_COMMAND_EXCERPT, VER_COMMAND_EXCERPT, $len ) . $req; // add header
    1384. if ( !( $this->_Send ( $fp, $req, $len+8 ) ) ||
    1385. !( $response = $this->_GetResponse ( $fp, VER_COMMAND_EXCERPT ) ) )
    1386. {
    1387. $this->_MBPop ();
    1388. return false;
    1389. }
    1390. //////////////////
    1391. // parse response
    1392. //////////////////
    1393. $pos = 0;
    1394. $res = array ();
    1395. $rlen = strlen($response);
    1396. for ( $i=0; $i<count($docs); $i++ )
    1397. {
    1398. list(,$len) = unpack ( "N*", substr ( $response, $pos, 4 ) );
    1399. $pos += 4;
    1400. if ( $pos+$len > $rlen )
    1401. {
    1402. $this->_error = "incomplete reply";
    1403. $this->_MBPop ();
    1404. return false;
    1405. }
    1406. $res[] = $len ? substr ( $response, $pos, $len ) : "";
    1407. $pos += $len;
    1408. }
    1409. $this->_MBPop ();
    1410. return $res;
    1411. }
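BuildExcerpts() folds its boolean options into one integer bitmask before sending the request. A standalone sketch of that computation; `excerptFlags` is a hypothetical helper that mirrors the bit values used in the function.

```php
<?php
// Hypothetical helper reproducing the flag computation.
function excerptFlags($opts)
{
    $flags = 1; // bit 0 is always set ("remove spaces")
    if (!empty($opts["exact_phrase"]))   $flags |= 2;
    if (!empty($opts["single_passage"])) $flags |= 4;
    if (!empty($opts["use_boundaries"])) $flags |= 8;
    if (!empty($opts["weight_order"]))   $flags |= 16;
    return $flags;
}

$flags = excerptFlags(array("exact_phrase" => true, "weight_order" => true));
echo $flags, "\n";
```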
    1412. /////////////////////////////////////////////////////////////////////////////
    1413. // keyword generation
    1414. /////////////////////////////////////////////////////////////////////////////
    1415. /// connect to searchd server, and generate keyword list for a given query
    1416. /// returns false on failure,
    1417. /// an array of words on success
    1418. function BuildKeywords ( $query, $index, $hits )
    1419. {
    1420. assert ( is_string($query) );
    1421. assert ( is_string($index) );
    1422. assert ( is_bool($hits) );
    1423. $this->_MBPush ();
    1424. if (!( $fp = $this->_Connect() ))
    1425. {
    1426. $this->_MBPop();
    1427. return false;
    1428. }
    1429. /////////////////
    1430. // build request
    1431. /////////////////
    1432. // v.1.0 req
    1433. $req = pack ( "N", strlen($query) ) . $query; // req query
    1434. $req .= pack ( "N", strlen($index) ) . $index; // req index
    1435. $req .= pack ( "N", (int)$hits );
    1436. ////////////////////////////
    1437. // send query, get response
    1438. ////////////////////////////
    1439. $len = strlen($req);
    1440. $req = pack ( "nnN", SEARCHD_COMMAND_KEYWORDS, VER_COMMAND_KEYWORDS, $len ) . $req; // add header
    1441. if ( !( $this->_Send ( $fp, $req, $len+8 ) ) ||
    1442. !( $response = $this->_GetResponse ( $fp, VER_COMMAND_KEYWORDS ) ) )
    1443. {
    1444. $this->_MBPop ();
    1445. return false;
    1446. }
    1447. //////////////////
    1448. // parse response
    1449. //////////////////
    1450. $pos = 0;
    1451. $res = array ();
    1452. $rlen = strlen($response);
    1453. list(,$nwords) = unpack ( "N*", substr ( $response, $pos, 4 ) );
    1454. $pos += 4;
    1455. for ( $i=0; $i<$nwords; $i++ )
    1456. {
    1457. list(,$len) = unpack ( "N*", substr ( $response, $pos, 4 ) ); $pos += 4;
    1458. $tokenized = $len ? substr ( $response, $pos, $len ) : "";
    1459. $pos += $len;
    1460. list(,$len) = unpack ( "N*", substr ( $response, $pos, 4 ) ); $pos += 4;
    1461. $normalized = $len ? substr ( $response, $pos, $len ) : "";
    1462. $pos += $len;
    1463. $res[] = array ( "tokenized"=>$tokenized, "normalized"=>$normalized );
    1464. if ( $hits )
    1465. {
    1466. list($ndocs,$nhits) = array_values ( unpack ( "N*N*", substr ( $response, $pos, 8 ) ) );
    1467. $pos += 8;
    1468. $res [$i]["docs"] = $ndocs;
    1469. $res [$i]["hits"] = $nhits;
    1470. }
    1471. if ( $pos > $rlen )
    1472. {
    1473. $this->_error = "incomplete reply";
    1474. $this->_MBPop ();
    1475. return false;
    1476. }
    1477. }
    1478. $this->_MBPop ();
    1479. return $res;
    1480. }
    1481. function EscapeString ( $string )
    1482. {
    1483. $from = array ( '\\', '(',')','|','-','!','@','~','"','&', '/' ); // backslash first, so inserted backslashes are not escaped again
    1484. $to = array ( '\\\\', '\(','\)','\|','\-','\!','\@','\~','\"', '\&', '\/' );
    1485. return str_replace ( $from, $to, $string );
    1486. }
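When str_replace() is given arrays, it applies the search/replace pairs one after another, so the order of the pairs matters for escaping. The standalone sketch below lists the backslash first so that backslashes inserted by later pairs are not escaped a second time. `escapeQuery` is a hypothetical name for this sketch.

```php
<?php
// Standalone sketch of query escaping with the backslash handled first.
function escapeQuery($string)
{
    $from = array('\\', '(', ')', '|', '-', '!', '@', '~', '"', '&', '/');
    $to   = array('\\\\', '\(', '\)', '\|', '\-', '\!', '\@', '\~', '\"', '\&', '\/');
    return str_replace($from, $to, $string);
}

echo escapeQuery('hello (world)'), "\n";
echo escapeQuery('a\b'), "\n";
```

If the backslash pair came last instead, the backslash added in front of each parenthesis would itself be doubled.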
    1487. /////////////////////////////////////////////////////////////////////////////
    1488. // attribute updates
    1489. /////////////////////////////////////////////////////////////////////////////
    1490. /// batch update given attributes in given rows in given indexes
    1491. /// returns the number of updated documents (0 or more) on success, or -1 on failure
    1492. function UpdateAttributes ( $index, $attrs, $values, $mva=false )
    1493. {
    1494. // verify everything
    1495. assert ( is_string($index) );
    1496. assert ( is_bool($mva) );
    1497. assert ( is_array($attrs) );
    1498. foreach ( $attrs as $attr )
    1499. assert ( is_string($attr) );
    1500. assert ( is_array($values) );
    1501. foreach ( $values as $id=>$entry )
    1502. {
    1503. assert ( is_numeric($id) );
    1504. assert ( is_array($entry) );
    1505. assert ( count($entry)==count($attrs) );
    1506. foreach ( $entry as $v )
    1507. {
    1508. if ( $mva )
    1509. {
    1510. assert ( is_array($v) );
    1511. foreach ( $v as $vv )
    1512. assert ( is_int($vv) );
    1513. } else
    1514. assert ( is_int($v) );
    1515. }
    1516. }
    1517. // build request
    1518. $req = pack ( "N", strlen($index) ) . $index;
    1519. $req .= pack ( "N", count($attrs) );
    1520. foreach ( $attrs as $attr )
    1521. {
    1522. $req .= pack ( "N", strlen($attr) ) . $attr;
    1523. $req .= pack ( "N", $mva ? 1 : 0 );
    1524. }
    1525. $req .= pack ( "N", count($values) );
    1526. foreach ( $values as $id=>$entry )
    1527. {
    1528. $req .= sphPackU64 ( $id );
    1529. foreach ( $entry as $v )
    1530. {
    1531. $req .= pack ( "N", $mva ? count($v) : $v );
    1532. if ( $mva )
    1533. foreach ( $v as $vv )
    1534. $req .= pack ( "N", $vv );
    1535. }
    1536. }
    1537. // connect, send query, get response
    1538. if (!( $fp = $this->_Connect() ))
    1539. return -1;
    1540. $len = strlen($req);
    1541. $req = pack ( "nnN", SEARCHD_COMMAND_UPDATE, VER_COMMAND_UPDATE, $len ) . $req; // add header
    1542. if ( !$this->_Send ( $fp, $req, $len+8 ) )
    1543. return -1;
    1544. if (!( $response = $this->_GetResponse ( $fp, VER_COMMAND_UPDATE ) ))
    1545. return -1;
    1546. // parse response
    1547. list(,$updated) = unpack ( "N*", substr ( $response, 0, 4 ) );
    1548. return $updated;
    1549. }
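UpdateAttributes() expects $values to map each document ID to a row holding one integer per attribute, in the same order as $attrs. The sketch below mirrors the assertions for the non-MVA case as a boolean check. `validUpdateValues`, the attribute names and the document IDs are all made up.

```php
<?php
// Hypothetical checker: every row needs one integer per attribute.
function validUpdateValues($attrs, $values)
{
    foreach ($values as $id => $entry)
    {
        if (!is_numeric($id) || !is_array($entry))
            return false;
        if (count($entry) != count($attrs))
            return false;
        foreach ($entry as $v)
            if (!is_int($v))
                return false;
    }
    return true;
}

$attrs = array("group_id", "views");
$values = array(
    101 => array(7, 1500),
    205 => array(3, 42),
);
$ok = validUpdateValues($attrs, $values);
$bad = validUpdateValues($attrs, array(101 => array(7))); // row too short
var_dump($ok, $bad);
```

A check like this can be useful because assert() may be disabled in production, in which case a malformed row would be packed into the request unchecked.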
    1550. /////////////////////////////////////////////////////////////////////////////
    1551. // persistent connections
    1552. /////////////////////////////////////////////////////////////////////////////
    1553. function Open()
    1554. {
    1555. if ( $this->_socket !== false )
    1556. {
    1557. $this->_error = 'already connected';
    1558. return false;
    1559. }
    1560. if ( !$fp = $this->_Connect() )
    1561. return false;
    1562. // command, command version = 0, body length = 4, body = 1
    1563. $req = pack ( "nnNN", SEARCHD_COMMAND_PERSIST, 0, 4, 1 );
    1564. if ( !$this->_Send ( $fp, $req, 12 ) )
    1565. return false;
    1566. $this->_socket = $fp;
    1567. return true;
    1568. }
    1569. function Close()
    1570. {
    1571. if ( $this->_socket === false )
    1572. {
    1573. $this->_error = 'not connected';
    1574. return false;
    1575. }
    1576. fclose ( $this->_socket );
    1577. $this->_socket = false;
    1578. return true;
    1579. }
    1580. }
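Open() switches the connection to persistent mode by sending one small command packet: command ID, command version 0, body length 4, and a body of 1. A sketch of those 12 bytes follows. The numeric value of the command ID is an assumption here (4); the authoritative value is the SEARCHD_COMMAND_PERSIST define near the top of sphinxapi.php.

```php
<?php
// ASSUMPTION: command ID 4; check the define() in your copy of sphinxapi.php.
$commandPersist = 4;

// 16-bit command, 16-bit command version, 32-bit body length, 32-bit body.
$req = pack("nnNN", $commandPersist, 0, 4, 1);

echo strlen($req), " bytes: ", bin2hex($req), "\n";
```

The length matches the literal 12 that Open() passes to _Send().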
    1581. //
    1582. // $Id: sphinxapi.php 1566 2008-11-17 19:06:44Z shodan $
    1583. //
    1584. ?>