My aim is to store all unique terms along with their MD5 hashes in a database. I have a 1 million document index which has ~400,000 unique terms. I got this figure using a cardinality aggregation in Elasticsearch:

GET /dt_index/document/_search
{
  "aggregations": {
    "my_agg": {
      "cardinality": {
        "field": "text"
      }
    }
  }
}

I can get the unique terms using a terms aggregation like the following, which returns at most 100 of them:

GET /dt_matrix/document/_search
{
  "aggregations": {
    "my_agg": {
      "terms": {
        "field": "text",
        "size": 100
      }
    }
  }
}