Search with phonetic analysis plugin

Posted by ChenRiang on May 26, 2020

Elasticsearch version : 7.7

In this article, we look at how to search for a keyword using a word with a similar pronunciation (e.g. “write” and “right”) via the Phonetic Analysis plugin in Elasticsearch. Check out the official documentation for more information.

Plugin Installation

In this article, we use Docker as the environment. Check out Install plugin on Elasticsearch container for more information.

Add the following new line to docker-entrypoint-es.sh:

#!/bin/bash
bin/elasticsearch-plugin install analysis-phonetic
...
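
Once the container has started, it may be worth confirming that the plugin was actually loaded. The sketch below uses the `_cat/plugins` API and assumes Elasticsearch is reachable at http://localhost:9200 without authentication; that URL and the helper names `has_plugin` and `fetch_plugins` are assumptions made for illustration, not part of the original setup.

```python
import json
import urllib.request

# Assumption: a local, unsecured node. Change this to match your container setup.
ES_URL = "http://localhost:9200"


def has_plugin(cat_plugins_rows, plugin_name="analysis-phonetic"):
    """Return True if any row of a `_cat/plugins?format=json` response lists the plugin."""
    return any(row.get("component") == plugin_name for row in cat_plugins_rows)


def fetch_plugins(base_url=ES_URL):
    """Fetch the plugins installed on every node as a list of dicts."""
    with urllib.request.urlopen(f"{base_url}/_cat/plugins?format=json") as resp:
        return json.load(resp)
```

Calling `has_plugin(fetch_plugins())` against a running node should return `True` when the install line above has run successfully.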

Plugin Enablement

In this example, we simulate a situation in which the plugin's analyzer is enabled on an index that has already been created.

  1. Create an index called mytest.
    
     PUT /mytest
     {
       "settings": {
         "index": {
           "number_of_replicas": 0,
           "number_of_shards": 1
         }
       }
     }
    


  2. Close the index.
    
     POST /mytest/_close
    

    Note: The index must be closed before an analyzer or filter can be added.

  3. Enable the plugin by defining an analyzer and a filter via the Update Index Settings API.
    
     PUT /mytest/_settings
     {
       "analysis": {
         "analyzer": {
           "my_phonetic_analyzer": {
             "tokenizer": "standard",
             "filter": [
               "lowercase",
               "my_phonetic_filter"
             ]
           }
         },
         "filter": {
           "my_phonetic_filter": {
             "type": "phonetic",
             "encoder": "metaphone",
             "replace": false
           }
         }
       }
     }
        
    


  4. Re-open the index.
    
     POST /mytest/_open
    


  5. Define a mapping whose field uses my_phonetic_analyzer, which in turn applies my_phonetic_filter.
    
     PUT /mytest/_mapping
     {
       "properties": {
         "phonetic": {
           "type": "text",
           "analyzer": "my_phonetic_analyzer"
         }
       }
     }
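
Steps 2 to 5 above depend on their order (close, update settings, re-open, then map), so it can help to script them. The following is a minimal sketch using only the Python standard library; it assumes an unsecured node at http://localhost:9200, and the names `enablement_steps`, `send` and `enable_phonetic` are hypothetical helpers introduced here for illustration.

```python
import json
import urllib.request

# Assumption: a local, unsecured node. Adjust for your environment.
ES_URL = "http://localhost:9200"
INDEX = "mytest"

# Same analyzer and filter definitions as in step 3.
ANALYSIS = {
    "analysis": {
        "analyzer": {
            "my_phonetic_analyzer": {
                "tokenizer": "standard",
                "filter": ["lowercase", "my_phonetic_filter"],
            }
        },
        "filter": {
            "my_phonetic_filter": {
                "type": "phonetic",
                "encoder": "metaphone",
                "replace": False,
            }
        },
    }
}

# Same mapping as in step 5.
MAPPING = {
    "properties": {
        "phonetic": {"type": "text", "analyzer": "my_phonetic_analyzer"}
    }
}


def enablement_steps(index=INDEX):
    """Return the (method, path, body) tuples for steps 2 to 5, in order."""
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", ANALYSIS),
        ("POST", f"/{index}/_open", None),
        ("PUT", f"/{index}/_mapping", MAPPING),
    ]


def send(method, path, body=None, base_url=ES_URL):
    """Send one request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def enable_phonetic(index=INDEX):
    """Run every step in order; urlopen raises HTTPError on an error response."""
    return [send(method, path, body) for method, path, body in enablement_steps(index)]
```

Keeping the steps in a plain list makes the required order explicit, and because `urlopen` raises on an error response, a failed close stops the run before the settings update is attempted.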
    

Plugin Validation

  1. Index a document with content in a JSON field called phonetic.

    
     POST /mytest/_doc
     {
     "phonetic": "I couldn't remember the right answer."
     }
    
  2. Perform a match query for the word “write”, which is pronounced like the indexed word “right”.

    
     GET /mytest/_search
     {
       "query": {
         "match": {
           "phonetic": "write"
         }
       }
     }
    

    You will notice that the plugin treats “right” and “write” as similarly pronounced and returns the matching document.
    POSTMAN: Phonetic Result
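
To see why the match works, the `_analyze` API can show which tokens my_phonetic_analyzer produces for a given text; a match query finds a document when the query and the indexed field share at least one token. Below is a minimal sketch, again assuming an unsecured node at http://localhost:9200; the helper names `analyze_tokens`, `tokens_from_response` and `shared_tokens` are hypothetical.

```python
import json
import urllib.request

# Assumption: a local, unsecured node. Adjust for your environment.
ES_URL = "http://localhost:9200"


def tokens_from_response(response):
    """Extract the set of token strings from an `_analyze` response body."""
    return {entry["token"] for entry in response.get("tokens", [])}


def analyze_tokens(text, index="mytest", analyzer="my_phonetic_analyzer", base_url=ES_URL):
    """Ask Elasticsearch which tokens the analyzer emits for `text`."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return tokens_from_response(json.load(resp))


def shared_tokens(tokens_a, tokens_b):
    """Return the tokens two analyzed texts have in common."""
    return tokens_a & tokens_b
```

Because the filter is configured with `"replace": false`, it should keep each original token alongside its phonetic code, so `shared_tokens(analyze_tokens("write"), analyze_tokens("right"))` is expected to contain only the shared phonetic code and not the words themselves.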