
The thing is, there's really not much leeway in terms of function encoding. Here are the main options:

Term rewriting: you store functions as their abstract syntax trees (or some encoding thereof). When you call a function, you traverse the syntax tree and replace its parameters with the arguments. This is easy to implement, but terribly inefficient in terms of time and space.
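A minimal sketch of the term-rewriting approach in Python, assuming a made-up tuple encoding of lambda terms (the names `subst` and `apply_fn` are illustrative, not from any real implementation; capture-avoiding renaming is omitted for brevity):

```python
def subst(term, name, value):
    """Replace free occurrences of `name` in `term` with `value`.
    Note: every call walks the whole subtree -- this is the cost
    the answer is talking about. (Ignores variable capture.)"""
    kind = term[0]
    if kind == "var":
        return value if term[1] == name else term
    if kind == "lam":
        param, body = term[1], term[2]
        if param == name:           # `name` is shadowed inside this lambda
            return term
        return ("lam", param, subst(body, name, value))
    if kind == "app":
        return ("app", subst(term[1], name, value),
                       subst(term[2], name, value))

def apply_fn(fn, arg):
    """Beta-reduce one application by rewriting the function body."""
    _, param, body = fn
    return subst(body, param, arg)

# (\x. x) y  ->  y
identity = ("lam", "x", ("var", "x"))
print(apply_fn(identity, ("var", "y")))   # ('var', 'y')
```

Every application copies and traverses the body, which is why this loses badly to closures on anything non-trivial.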

Closures: you have some way of representing a function, maybe a syntax tree, more likely machine code. And in these functions, you refer to your arguments by reference in some way. It could be a pointer-offset, it could be an integer or De Bruijn index, it could be a name. Then you represent a function as a closure: the function "instructions" (tree, code, etc.) paired with a data structure containing all free variables of the function. When a function is actually applied, it somehow knows how to look up the free variables in its data structure, using environments, pointer arithmetic, etc.
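The closure idea can be sketched in a few lines, assuming an explicit (code, environment) pair rather than Python's own built-in closures (the names `make_adder` and `call` are invented for the example):

```python
def make_adder(n):
    env = {"n": n}            # data structure holding the free variables
    def code(env, x):         # the shared "instructions"; free vars go
        return env["n"] + x   # through the environment, not the source
    return (code, env)        # closure = (instructions, environment)

def call(closure, arg):
    code, env = closure
    return code(env, arg)

add3 = make_adder(3)
print(call(add3, 4))    # 7
```

Note that the `code` part is shared between all closures produced by `make_adder`; only the small environment differs per instance, and calling is a constant-time lookup plus a jump rather than a term traversal.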

I'm sure there are other options, but these are the basic ones, and I suspect almost every other option will be a variant of or optimization of the basic closure structure.

So, in terms of performance, closures almost universally perform better than term rewriting. Of the variations, which is better? That depends heavily on your language and architecture, but I suspect the "machine-code with a struct containing free vars" is the most efficient. It has everything the function needs (instructions and values) and nothing more, and calling doesn't end up doing large term traversals.

> I am interested in both the current encoding algorithm popular functional languages (Haskell, ML) use

I'm not an expert, but I'm 99% sure most ML flavours use some variation of the closures I describe, albeit likely with some optimizations. See this for a (possibly out-of-date) perspective.

Haskell (or rather GHC) does something more complicated because of lazy evaluation: it compiles to the Spineless Tagless G-machine (STG), which evaluates programs by graph reduction.

> and also in the most efficient one that can be achieved.

What is most efficient? No implementation will be most efficient across all inputs, so you get implementations that are efficient on average, each excelling in different scenarios. So there's no definitive ranking of most or least efficient.

There's no magic here. To store a function, you need to store the values of its free variables somehow; otherwise you're encoding less information than the function itself contains. Maybe you can optimize away some of the free variables with partial evaluation, but that's risky for performance, and you have to be careful to ensure that it always halts.

And, maybe you can use some sort of compression, or clever algorithm to gain space efficiency. But then you're either trading time for space, or you're in the situation where you've optimized for some cases and slowed down for others.

You can optimize for the common case, but what the common case is can change on the language, area of application, etc. The type of code that's fast for a video game (number crunching, tight loops with large input) is probably different than what's fast for a compiler (tree traversals, worklists, etc.).

> Bonus point: Is there such encoding that maps function encoded integers to native integers (short, int etc in C). Is it even possible?

No, this is not possible. The problem is that the lambda calculus does not let you introspect terms. When a function takes an argument with the same type as a Church numeral, it needs to be able to call it without examining the exact definition of that numeral. That's the thing with Church encodings: the only thing you can do with them is call them, and you can simulate everything useful this way, but not without a cost.
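To see concretely why "call it" is the only interface, here is a sketch of Church numerals in Python (assuming the usual encoding; `to_int` is an illustrative name):

```python
# A Church numeral n is a function that applies its argument n times:
# n = \f. \x. f (f (... (f x)))
zero = lambda f: lambda x: x

def succ(n):
    # succ n = \f. \x. f (n f x)
    return lambda f: lambda x: f(n(f)(x))

def to_int(n):
    # The only way to recover a native int is to *run* the numeral,
    # applying +1 once per unit -- O(n) calls, no way to read off bits.
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(three))    # 3
```

Nothing here inspects the numeral's structure; conversion to a machine integer costs one function call per unit, which is exactly the overhead the question hopes to avoid.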

More importantly, the integers occupy every possible binary encoding. So if lambdas were represented as native integers, you'd have no way of representing non-Church-numeral lambdas! Or you could introduce a flag denoting whether a lambda is a numeral or not, but then any efficiency you hoped for has probably gone out the window.