I'm currently managing a codebase in which we've got a MySQL database with all records encrypted using the php-encryption library. This works well for our current setup. We now have a new business requirement: it should be possible to do a SELECT based on one of the encrypted fields.

Since it is impossible to select based on the encrypted values, I searched around and found CipherSweet. It's a new (6 months old) repo with currently only 136 GitHub stars. I've read through a blog post about it written by the company behind the lib.

It is based on blind indexing: the general idea is to store a keyed hash (e.g. HMAC) of the plaintext in a separate column. The blind index key should be distinct from the encryption key and unknown to the database server.

As far as I understand (but I could be wrong here), the value I want to search for is hashed using the index key, which is static per column. The resulting hash is then looked up in the HMAC column. When a matching record is found, its encrypted value is decrypted.
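
To check my understanding, here is a minimal sketch of how I picture the lookup. It is written in Python with plain HMAC-SHA256 and is not CipherSweet's actual code or API; the key, the column names and the in-memory `rows` list standing in for the table are all made up for illustration.

```python
import hashlib
import hmac

# Hypothetical per-column index key. In a real setup it would be random,
# held only by the application, and different from the encryption key.
INDEX_KEY = b"example-index-key-for-email-column"


def blind_index(plaintext: str, key: bytes = INDEX_KEY) -> str:
    """Return a keyed hash (HMAC-SHA256) of the plaintext as a hex string."""
    return hmac.new(key, plaintext.encode("utf-8"), hashlib.sha256).hexdigest()


# Stand-in for the database table: the ciphertext column is a placeholder,
# the index column holds the keyed hash of the plaintext.
rows = [
    {"id": 1, "email_encrypted": "<ciphertext 1>",
     "email_index": blind_index("alice@example.com")},
    {"id": 2, "email_encrypted": "<ciphertext 2>",
     "email_index": blind_index("bob@example.com")},
]


def find_by_email(email: str) -> list:
    """Hash the search term with the index key and match on the index column."""
    needle = blind_index(email)
    return [row["id"] for row in rows
            if hmac.compare_digest(row["email_index"], needle)]
```

With a real database the last step would presumably be a `SELECT ... WHERE email_index = ?` with the computed hash as the parameter, after which the application decrypts the encrypted column of the rows that come back.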

They describe that it does have a duplicate entry leak, meaning that if all records are obtained it can be known which records have the same value, but not what that value is.
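
As I understand it, the leak looks roughly like the following sketch (again plain HMAC-SHA256 in Python with a made-up key and made-up data, not the library's own code). Someone who only sees the index column can group identical hashes without ever learning the plaintext.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical index key; the attacker in this scenario does NOT know it.
index_key = b"hypothetical-index-key"


def blind_index(value: str, key: bytes) -> str:
    """Keyed hash of the plaintext, as stored in the index column."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# What the application stores: record id -> blind index of a surname.
plaintexts = {1: "Smith", 2: "Jones", 3: "Smith"}
index_column = {rid: blind_index(v, index_key) for rid, v in plaintexts.items()}


def group_duplicates(column: dict) -> list:
    """Group record ids by identical index value, using only the index column."""
    groups = defaultdict(list)
    for rid, tag in column.items():
        groups[tag].append(rid)
    return sorted(ids for ids in groups.values() if len(ids) > 1)


# The attacker sees only index_column, yet learns that records 1 and 3 match.
duplicates = group_duplicates(index_column)
```

Here `duplicates` ends up as `[[1, 3]]`: the attacker learns that records 1 and 3 share a value, but nothing in the index column says that the value is "Smith".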

I understand the concepts and it sounds OK, but since I'm not a cryptography expert I can't really judge its security. Isn't it somehow possible to use this duplicate entry leak to mount some other attack? I've always learned (= read on the internet) that encryption should always include an IV/nonce/salt to make rainbow tables impossible. I guess the use of the static index key per column prevents these rainbow tables, though.
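
To illustrate what I mean by that guess, here is a small Python sketch with a made-up key and a made-up low-entropy value. A table of precomputed unkeyed hashes reverses a plain SHA-256 index, but the same table is useless against an HMAC as long as the attacker does not have the key.

```python
import hashlib
import hmac

# Hypothetical index key that never leaves the application.
secret_index_key = b"hypothetical-key-the-attacker-does-not-have"

# A low-entropy value, e.g. a date of birth.
stored_value = "1985-03-14"

# Two candidate index designs: plain hash vs. keyed hash.
unkeyed = hashlib.sha256(stored_value.encode("utf-8")).hexdigest()
keyed = hmac.new(secret_index_key, stored_value.encode("utf-8"),
                 hashlib.sha256).hexdigest()

# The attacker precomputes plain SHA-256 hashes of likely values.
candidates = ["1985-03-13", "1985-03-14", "1985-03-15"]
table = {hashlib.sha256(c.encode("utf-8")).hexdigest(): c for c in candidates}

# The precomputed table reverses the unkeyed hash...
recovered_from_unkeyed = table.get(unkeyed)
# ...but finds nothing for the keyed hash, because the key is unknown.
recovered_from_keyed = table.get(keyed)
```

This only covers an attacker who has the database but not the index key; if the key leaks as well, I assume the same kind of guessing works again.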

Basically I've got the feeling I'm missing something here. Why is the duplicate entry leak not a problem all of a sudden? Is there anybody else out there who can comment on this library/technology?