2025-01-18 Update From: SLTechnology News&Howtos
Shulou (Shulou.com) 06/03 Report
This article explains what to do when PHP cannot load SCWS. The method described here is simple, fast, and practical.
Operating environment for this article: Windows 7, PHP 5.4, Dell G3 computer.
What if PHP cannot load SCWS? Installation and usage example of SCWS, an open-source Chinese word segmentation system for PHP
I. A brief introduction to SCWS
SCWS stands for Simple Chinese Word Segmentation.
It is a mechanical Chinese word segmentation engine based on a word-frequency dictionary, and it can split a whole paragraph of Chinese text into words with largely correct results. In Chinese, words are the basic unit of meaning, but unlike English they are not separated by spaces in writing, so segmenting text accurately and quickly has long been a hard problem in Chinese text processing.
SCWS is written in pure C and does not depend on any external libraries. It can be embedded directly into applications as a dynamic link library, and it supports Chinese encodings such as GBK and UTF-8. A PHP extension module is also provided, so the segmentation functions can be used quickly and easily from PHP.
The segmentation algorithm itself contains little that is novel. It uses a word-frequency dictionary collected by the authors, supplemented by rules for proper names, personal names, place names, numbers, and dates, to achieve basic word segmentation. In small-scale tests the accuracy was between 90% and 95%, which is generally enough for small search engines, keyword extraction, and similar uses. The first prototype version was released at the end of 2005.
SCWS is developed by hightman and released as open source under the BSD license. The source code is hosted on GitHub.
II. SCWS installation
The code is as follows:
# wget -c http://www.xunsearch.com/scws/down/scws-1.2.1.tar.bz2
# tar jxvf scws-1.2.1.tar.bz2
# cd scws-1.2.1
# ./configure --prefix=/usr/local/scws
# make && make install
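Before building the PHP extension, it can help to confirm that the install step above actually put files under the chosen prefix. The following is a minimal sketch, not part of SCWS itself: the function name scws_install_ok is hypothetical, and it assumes that "make install" places the command-line tool at <prefix>/bin/scws.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SCWS): succeed only if the given
# install prefix contains an executable bin/scws.
# Assumption: "make install" puts the command-line tool at <prefix>/bin/scws.
scws_install_ok() {
    prefix=$1
    if [ -x "$prefix/bin/scws" ]; then
        echo "ok: $prefix/bin/scws"
    else
        echo "missing: $prefix/bin/scws" >&2
        return 1
    fi
}

# Example call for the prefix used in this article:
#   scws_install_ok /usr/local/scws
```

If the check fails, the configure or make step probably did not finish, and the extension build in the next section is unlikely to succeed.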
III. Installing the SCWS PHP extension
The code is as follows:
# cd ./phpext
# phpize
# ./configure --with-php-config=/usr/local/php5410/bin/php-config
# make && make install
# echo "[scws]" >> /usr/local/php5410/etc/php.ini
# echo "extension = scws.so" >> /usr/local/php5410/etc/php.ini
# echo "scws.default.charset = utf-8" >> /usr/local/php5410/etc/php.ini
# echo "scws.default.fpath = /usr/local/scws/etc/" >> /usr/local/php5410/etc/php.ini
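When PHP still does not load SCWS after these steps, two things are worth checking first: whether the module appears in the output of "php -m", and whether the php.ini that PHP actually reads (shown by "php --ini") contains the extension line. The sketch below wraps both checks in small functions. The function names are hypothetical, and the path /usr/local/php5410/bin/php in the comments is an assumption based on the php-config path used above.

```shell
#!/bin/sh
# Hypothetical helpers for diagnosing an extension that does not load.

# Read a module list (the output of "php -m") on stdin and succeed if
# the named module appears on a line by itself (case-insensitive).
has_php_module() {
    grep -qixF -- "$1"
}

# Succeed if the ini file ($1) has an uncommented
# "extension = <name>.so" line for the extension name in $2.
ini_enables_extension() {
    grep -Eq "^[[:space:]]*extension[[:space:]]*=[[:space:]]*\"?$2\.so\"?[[:space:]]*$" "$1"
}

# Example calls (the php binary path is an assumption):
#   /usr/local/php5410/bin/php -m | has_php_module scws && echo "scws is loaded"
#   /usr/local/php5410/bin/php --ini
#   ini_enables_extension /usr/local/php5410/etc/php.ini scws || echo "extension line missing"
```

If the ini check passes but the module is still missing, the php.ini that was edited may not be the one PHP loads, or scws.so may not be in the extension directory. A web server or PHP-FPM process also needs a restart before it picks up the change.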
IV. Installing the dictionary
The code is as follows:
# wget http://www.xunsearch.com/scws/down/scws-dict-chs-utf8.tar.bz2
# tar jxvf scws-dict-chs-utf8.tar.bz2 -C /usr/local/scws/etc/
# chown www:www /usr/local/scws/etc/dict.utf8.xdb
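A dictionary file that is missing, empty, or unreadable by the account PHP runs under is another plausible reason segmentation fails. The sketch below uses a hypothetical helper, dict_readable, that tests only readability and non-zero size for whoever runs it. It does not validate the file's contents.

```shell
#!/bin/sh
# Hypothetical helper: report whether a dictionary file exists, is
# readable by the current user, and is not empty.
dict_readable() {
    if [ -r "$1" ] && [ -s "$1" ]; then
        echo "readable: $1"
    else
        echo "unreadable or empty: $1" >&2
        return 1
    fi
}

# Example call for the dictionary installed above:
#   dict_readable /usr/local/scws/etc/dict.utf8.xdb
```

To reproduce what PHP sees, run the check as the web server account (www in the chown command above) rather than as root.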
V. PHP example code. For details, see the official SCWS API documentation.
The code is as follows:
<?php
// Instantiate the core segmentation object
$so = scws_new();
// Set the character encoding used for segmentation
$so->set_charset('utf-8');
// Set the dictionary (the UTF-8 dictionary is used here)
$so->set_dict('/usr/local/scws/etc/dict.utf8.xdb');
// Set the segmentation rules file
$so->set_rule('/usr/local/scws/etc/rules.utf8.ini');
// Strip punctuation before segmentation
$so->set_ignore(true);
// Enable compound segmentation, e.g. "中国人" returns the three words "中国 + 人 + 中国人"
$so->set_multi(true);
// Automatically combine scattered single characters into two-character words
$so->set_duality(true);
// The text to segment
$so->send_text("Welcome to Martian IT Development");
// Fetch the segmentation results; to extract high-frequency words, use the get_tops method instead
while ($tmp = $so->get_result()) {
    print_r($tmp);
}
$so->close();
Description of the returned array:
word  string  the word itself
idf   float   inverse document frequency (IDF)
off   int     position of the word in the original text
attr  string  part of speech
At this point you should have a clearer picture of what to do when PHP cannot load SCWS. The best way to confirm it is to try the steps yourself.
© 2024 shulou.com SLNews company. All rights reserved.