In this part of our Elasticsearch tutorial blog series, we look further into mappings in Elasticsearch. We covered the basics of index creation and common settings in our previous blog about Dynamic Index Creation. Here, we go deeper into the mappings that can be defined in the settings when an index is created.

Mapping

A mapping defines the properties of the documents in a specific index type, such as the datatype (string, date, integer and so on) of each field in those documents. Defining the mapping when we create an index therefore plays a very important role, as an inappropriate mapping can make things difficult for us later.

Mappings can be applied at different levels, to the types of an index or to particular fields, and in both cases they can be defined statically or dynamically.

Sample Data

For this blog, let us consider a few documents containing minimal data on some cuisines by popular chefs.

Document 1

{ "cusinename" : "August Chopped Salads", "chefname" : "Thomas Keller", "cusine_des" : "Tulies shaped into tiny cones and topped with creme fraiche and fresh salmon", "chef_des" : "At his flagship New Orleans restaurant, August, Thomas Keller makes his chopped salad with 21 different kinds of vegetables and herbs.", "restaurantname" : "New Orleans Restaurant", "location" : "40.78,91.12" }

Document 2

{ "cusinename" : "Chilled Spring Pea Soup", "chefname" : "William Shors", "cusine_des" : "Mix of sweet peas, favas, pea shoots, snap peas and snow peas", "chef_des" : "Realised his passion for cooking at a very early age, he is now one of the most famous chefs in the world", "restaurantname" : "Café Boulud", "location" : "30.78,91.12" }

Dynamic Type Creation

When we create an index in Elasticsearch, we also define a type along with it. A type represents a class of similar documents. Every type has two components: its name and its mapping.

As mentioned in the introduction, the mapping of a type can be static or dynamic (https://www.elastic.co/guide/en/elasticsearch/reference/1.7/mapping-dynamic-mapping.html). Static mapping for types and fields has already been covered in the previous blog.

In most cases, all the types in an index share similar mappings. Sometimes, however, a few types need settings that differ from the rest.

For example, suppose an index named testindex has two types, type1 and type2. In type1 we need the _source field to be enabled, while type2 does not need it. We can express this by enabling _source in the _default_ mapping, which applies to every type in the index, and overriding it for type2:

curl -XPUT 'localhost:9200/testindex' -d '{ "mappings": { "_default_": { "_source": { "enabled": true } }, "type2": { "_source": { "enabled": false } } } }'

To see the effect of the above request, index two sample documents under the types type1 and type2 as below:

curl -XPUT 'localhost:9200/testindex/type1/1' -d '{ "name": "Alister Cain" }'

curl -XPUT 'localhost:9200/testindex/type2/1' -d '{ "name": "Mathew Cors" }'

Then run the following query in the terminal:

curl -XPOST 'localhost:9200/testindex/_search?pretty=true' -d '{}'

After running the above query, we can see that the document under type1 has the _source field in it and the document under type2 does not.
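To make the role of _default_ more concrete, here is a minimal Python sketch that models how a type's effective settings could be derived: start from the _default_ mapping and overlay whatever the type defines itself. The helper name effective_mapping is hypothetical, and the merge is a simplified local model rather than Elasticsearch's actual implementation; nothing here contacts a cluster.

```python
import copy

# The "mappings" section from the testindex request above.
mappings = {
    "_default_": {"_source": {"enabled": True}},
    "type2": {"_source": {"enabled": False}},
}


def effective_mapping(mappings, type_name):
    """Overlay a type's own mapping on top of _default_ (simplified model)."""
    merged = copy.deepcopy(mappings.get("_default_", {}))
    for key, value in mappings.get(type_name, {}).items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # The type-specific setting wins over the default one.
            merged[key].update(value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


for name in ("type1", "type2"):
    print(name, effective_mapping(mappings, name))
```

In this model type1 defines nothing of its own, so it inherits the enabled _source from _default_, while type2 overrides it with false.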

Dynamic Mapping

Occasionally, the documents we index will have extra fields. For example, a third document in our sample data might contain a new field for which we haven’t specified a mapping.

When this happens, Elasticsearch uses “dynamic mapping” to determine the datatype of the field and automatically adds the field and its type information to the corresponding mapping.

This is useful in many cases, but sometimes we may not want this to happen. We might want to ignore new fields, or to throw an error that alerts us whenever a new field is detected. Elasticsearch accounts for all three scenarios with the “dynamic” setting, which can be set to any of the following three values:

true: add new fields automatically. This is the default setting.

false: ignore the new fields

strict: throw an exception when a previously unknown field is detected
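The three values can be illustrated with a small Python sketch that models what happens to an unknown field under each setting. This is a simplified local model, not Elasticsearch code: apply_dynamic and StrictDynamicMappingError are hypothetical names, and the rating field is made up for the example.

```python
class StrictDynamicMappingError(Exception):
    """Stand-in for the error Elasticsearch raises in strict mode."""


def apply_dynamic(known_fields, document, dynamic=True):
    """Return (indexed_fields, updated_known_fields) for one document.

    dynamic=True     -> unknown fields are added to the mapping and indexed
    dynamic=False    -> unknown fields are ignored (not indexed)
    dynamic="strict" -> an unknown field raises an error
    """
    known = set(known_fields)
    indexed = {}
    for field, value in document.items():
        if field in known:
            indexed[field] = value
        elif dynamic == "strict":
            raise StrictDynamicMappingError("unknown field: " + field)
        elif dynamic is True:
            known.add(field)
            indexed[field] = value
        # dynamic is False: the field is skipped and the mapping is unchanged
    return indexed, known


known = {"cusinename", "chefname"}
doc = {"chefname": "Thomas Keller", "rating": 5}  # "rating" is a new field

print(apply_dynamic(known, doc, dynamic=True))
print(apply_dynamic(known, doc, dynamic=False))
try:
    apply_dynamic(known, doc, dynamic="strict")
except StrictDynamicMappingError as exc:
    print("rejected:", exc)
```

Note that with false, Elasticsearch still keeps the ignored field in the stored _source of the document; it just does not add it to the mapping or index it.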

In the previous example of field-level mapping, the dynamic property was set to true.

Customizing Dynamic Mapping

As we saw in the above section, setting the dynamic property to true is convenient when new fields are added on the fly. Helpful as this is, in some cases we need to modify this behavior. These cases and the ways to tackle them are discussed in the next two sections.

Auto date detection

Suppose we add a field called exhibitionDate to our sample data set. It usually holds a date, but in some documents its value is “not available”. With the dynamic property set to true, this leads to an unexpected problem. When the first document containing exhibitionDate is indexed with a date value, Elasticsearch maps the field as type date.

Now when we try to index a document whose exhibitionDate value is “not available”, Elasticsearch still expects a date there but encounters a string that cannot be parsed as one, and the indexing request fails with an error.

We can tackle this problem by customizing the dynamic mapping to disable automatic date detection. This can be done with the following settings:

curl -XPUT 'localhost:9200/testindex2' -d '{ "mappings": { "details": { "date_detection": false } } }'
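The following Python sketch models why the second document is rejected when date detection is on, and why it is accepted once detection is off. It is only an illustration under stated assumptions: the functions are hypothetical, the date value 2015-08-21 is made up, and a single assumed format (%Y-%m-%d) stands in for the date formats Elasticsearch actually tries.

```python
from datetime import datetime


def looks_like_date(value, fmt="%Y-%m-%d"):
    """True if value parses with fmt (an assumed format for this sketch)."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def infer_type(value, date_detection=True):
    """Pick a field type for the first string value seen (simplified)."""
    if date_detection and looks_like_date(value):
        return "date"
    return "string"


def accepts(field_type, value):
    """Check whether a later value fits the already-mapped type."""
    return field_type != "date" or looks_like_date(value)


for detection in (True, False):
    # The first document decides the mapping of exhibitionDate.
    mapped = infer_type("2015-08-21", date_detection=detection)
    # The second document carries the placeholder string.
    print(detection, mapped, accepts(mapped, "not available"))
```

With detection enabled the field is modelled as a date and the placeholder is rejected; with detection disabled the field is treated as a string and both values fit.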

dynamic_templates creation

Elasticsearch gives us the option to use dynamic templates for the newly detected fields. Dynamic templates can also be employed for applying different mappings depending on the field name or datatype.

In our sample data, some fields share the same mapping, namely cusinename, chefname and restaurantname. New fields may also be added on the fly, and we may decide that in the future every field ending with name should have exactly the same mapping as these three.

All three of those field names end with name. By employing dynamic templating, we can create a common template that maps every field ending with the word name. Since users will typically search these fields for exact values, it is better not to tokenize them.

In order for Elasticsearch to not tokenize a field, we need to set the index option in the field's mapping to not_analyzed. In addition, the location field holds geographical coordinates and is to be mapped to the geo_point type. This can be done like below:

curl -XPUT 'localhost:9200/testindex3' -d '{ "mappings": { "details": { "dynamic_templates": [ { "name_template": { "match": "*name", "mapping": { "type": "string", "index": "not_analyzed" } } } ], "properties": { "location": { "type": "geo_point", "index": "not_analyzed" } } } } }'

What we have done here is define a template named name_template. Inside name_template there is a parameter called match, which is given the value *name. This matches every field whose name ends with name. Under the mapping section we have defined the mapping we need, so it is applied to all fields ending with name in the documents to be indexed.
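To check which of the sample fields the *name pattern would pick up, here is a small Python sketch. It uses fnmatchcase from the standard library as a stand-in for Elasticsearch's wildcard matching, which is an assumption of this example, and mapping_for is a hypothetical helper that returns the mapping of the first template whose pattern matches.

```python
from fnmatch import fnmatchcase

# The dynamic_templates section from the testindex3 request above.
dynamic_templates = [
    {
        "name_template": {
            "match": "*name",
            "mapping": {"type": "string", "index": "not_analyzed"},
        }
    },
]


def mapping_for(field, templates):
    """Return the mapping of the first template matching field, else None."""
    for template in templates:
        for _template_name, spec in template.items():
            if fnmatchcase(field, spec["match"]):
                return spec["mapping"]
    return None


fields = ["cusinename", "chefname", "cusine_des", "chef_des", "restaurantname"]
for field in fields:
    print(field, "->", mapping_for(field, dynamic_templates))
```

Only cusinename, chefname and restaurantname match the pattern; cusine_des and chef_des fall through and would be handled by the regular dynamic mapping.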

After doing the above, check for the mapping in this empty index using the GET mapping API as below:

curl -XGET 'http://localhost:9200/testindex3/_mapping/details?pretty=true'

Now let us index document 1 in the index testindex3 like below:

curl -XPOST 'http://localhost:9200/testindex3/details/1' -d '{ "cusinename": "August Chopped Salads", "chefname": "Thomas Keller", "cusine_des": "Tulies shaped into tiny cones and topped with creme fraiche and fresh salmon", "chef_des": "At his flagship New Orleans restaurant, August, Thomas Keller makes his chopped salad with 21 different kinds of vegetables and herbs.", "restaurantname": "New Orleans Restaurant", "location": "40.78,91.12" }'

After indexing, check the mapping again using the GET mapping API request mentioned above. In the response we can see that the mappings are applied as we specified in the template.

Conclusion

In this blog, we have familiarized ourselves with various mapping procedures in Elasticsearch, including type-level dynamic mapping, enabling or disabling automatic date detection, and creating dynamic templates for mapping.