Solr problem one: numeric characters cannot be searched

Problem one: the testers told me that numbers could not be searched, so I started looking for the cause:

<fields>
***
<field name="productName" type="text" indexed="true" stored="true" />
***
</fields>

Configuration of fieldType text:
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.LowerCaseTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" side="front"/>
  </analyzer>
</fieldType>

The problem appears when productName contains numeric characters. For example, for a product named 'quack Karma 123', searching with the digits 1, 2, 3, or 12 finds nothing.

The same happens when the name is '123 quack Karma'. For a long time I could not work out the cause, and I did not know how to track it down, so I asked around. My guess was that tokenization was the problem, so I looked at the Solr admin interface to see whether anything could be found there.

Finally, someone in a QQ group said that solr.LowerCaseTokenizerFactory filters out digits. To see this, open the Analysis menu of the Solr admin interface, which shows how the current schema.xml configuration tokenizes a value; select the relevant field and try it. It really was a LowerCaseTokenizerFactory problem, so I looked for an alternative. After some searching and trial and error, the following configuration finally solved the problem of numbers not being searchable. (The type attribute of the corresponding field was changed to match.)

<fieldType name="text_inclunum" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 <filter class = "solr.
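
To make the difference between the two tokenizers concrete, here is a rough imitation in plain Java. It is only a sketch: the class and method names are made up for illustration, and it does not call any Solr or Lucene code. It assumes that solr.LowerCaseTokenizerFactory splits the text at every non-letter character and lowercases what is left, while solr.WhitespaceTokenizerFactory splits at whitespace only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; it imitates the tokenizers and is not Solr code.
class TokenizerSketch {

    // Imitates solr.LowerCaseTokenizerFactory: split at every non-letter, then lowercase.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A digit, space, or symbol ends the current token and is itself discarded.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Imitates solr.WhitespaceTokenizerFactory: split at whitespace only, keep case and digits.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(letterTokens("quack Karma 123"));     // [quack, karma]
        System.out.println(whitespaceTokens("quack Karma 123")); // [quack, Karma, 123]
    }
}
```

With the letter-based tokenizer the token 123 never reaches the index, so no digit query can match it. The whitespace tokenizer keeps 123, but it also keeps the original case, which is where the uppercase and lowercase issue described next comes from.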




Our product library has a pinyin field, and its values are stored in uppercase. If I search with AMXL, the matching pinyin initials are found, and through them the corresponding product, amoxicillin. (Solr queries run against an all field, and the pinyin is copied into that all field.)

But searching with amxl finds nothing. So in the program I call toUpperCase() on the query value before it is sent to Solr, which finally solved the problem of lowercase letters not being searchable.


Problem two:

But the next day I found that this had introduced a new problem. If a product is named 'd amoxicillin' and I search for d amoxicillin, the product 'd amoxicillin' is not found. At first I did not know why, so I tested it in Solr's Analysis page. I found that my program turns the query into 'D amoxicillin', while what Solr has indexed is 'd amoxicillin', and the same goes for every product whose name begins with a lowercase letter. So a query such as 'D amoxicillin' (even one picked from autocomplete) finds nothing.


The digit problem was solved, but now I had a lowercase-letter problem. I did not find a solution on the Solr side, so I decided to modify the program. The idea is to decide in the program whether to convert the query value to uppercase before it reaches Solr. If the query value contains Chinese, it is left unchanged. If it does not, it is converted to uppercase.

This way, product names containing digits or lowercase letters can be found, and an all-letter query can be matched through the pinyin. (With "solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50", each word is matched from left to right, that is, by its prefixes.)
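To illustrate the left-to-right matching, here is a small sketch of what front edge n-grams look like. It is plain Java with made-up names, written under the assumption that the filter emits every prefix of a token between minGramSize and maxGramSize characters long; it does not call the Solr filter itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it imitates front edge n-grams and is not Solr code.
class EdgeNGramSketch {

    // Returns the prefixes of token whose length lies between minGram and maxGram.
    static List<String> frontEdgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(frontEdgeNGrams("123", 1, 50));  // [1, 12, 123]
        System.out.println(frontEdgeNGrams("AMXL", 1, 50)); // [A, AM, AMX, AMXL]
    }
}
```

Under this sketch only prefixes are produced, so 1 and 12 would match the token 123, while 2 or 3 on their own would not. The grams also keep the case of the token, which is why AMXL and amxl behave differently.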

So I searched online for a regular expression that checks whether a string contains Chinese:

 

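    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Both methods are assumed to live in the author's own StringUtils class.

    /**
     * Determine whether a string contains Chinese characters.
     * @param str the string to check
     * @return true if at least one character lies in the range U+4E00 to U+9FA5
     */
    public static boolean isContainsChinese(String str)
    {
        Matcher matcher = Pattern.compile("[\u4e00-\u9fa5]").matcher(str);
        return matcher.find();
    }

    /**
     * Uppercase the query value unless it contains Chinese characters.
     */
    public static String toUpperOrNot(String temp)
    {
        if (temp == null)
            return "";
        // A value containing Chinese is passed through unchanged
        if (StringUtils.isContainsChinese(temp))
        {
            return temp;
        }
        else
        {
            return temp.toUpperCase();
        }
    }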


Then call toUpperOrNot() on the query value wherever it is passed to Solr. It is best to also call the escaping method below.



Tip: special characters in a Solr query value need to be escaped:

    // Characters that must be escaped in a Solr query value
    public static final String NEAD_TO_CONVERT_CHAR = "([!/:()])";

    // Escape the special characters by prefixing each one with a backslash
    public static String convertMeaningChar(String temp)
    {
        if (temp == null)
            return "";
        temp = temp.replaceAll(NEAD_TO_CONVERT_CHAR, "\\\\$1");
        return temp;
    }
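
As a rough illustration of how the two helpers might be used together, the sketch below puts them in one class and prepares a few query values. The class name QueryPrepSketch, the prepare method, and the order of the two calls (case first, then escaping) are assumptions made for the example; the post does not say how the calls are wired together.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class; the two helpers follow the snippets in this post.
class QueryPrepSketch {

    static final Pattern CHINESE = Pattern.compile("[\u4e00-\u9fa5]");
    static final String NEAD_TO_CONVERT_CHAR = "([!/:()])";

    // Uppercase the value unless it contains Chinese characters.
    static String toUpperOrNot(String temp) {
        if (temp == null) {
            return "";
        }
        return CHINESE.matcher(temp).find() ? temp : temp.toUpperCase();
    }

    // Prefix each listed special character with a backslash.
    static String convertMeaningChar(String temp) {
        if (temp == null) {
            return "";
        }
        return temp.replaceAll(NEAD_TO_CONVERT_CHAR, "\\\\$1");
    }

    // Assumed order: adjust the case first, then escape.
    static String prepare(String userInput) {
        return convertMeaningChar(toUpperOrNot(userInput));
    }

    public static void main(String[] args) {
        System.out.println(prepare("amxl"));            // AMXL
        System.out.println(prepare("d (10:1)"));        // D \(10\:1\)
        System.out.println(prepare("\u4e2d\u6587 d"));  // unchanged, contains Chinese
    }
}
```

Note that the character list here covers only a subset of the characters that have special meaning in Solr query syntax. If SolrJ is on the classpath, its ClientUtils.escapeQueryChars method escapes a broader set and may be worth considering instead.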


Origin www.cnblogs.com/cuihongyu3503319/p/11665919.html