Runtime code generation and barriers in migrating away from JVM-internal APIs

Hello,

I am the author of Byte Buddy and a maintainer of cglib, two of the major code generation libraries in the Java ecosystem. Both libraries are downloaded about 160 million times a year, and I want to give a report and opinion on the current state of moving away from JVM-internal APIs to safe and officially supported alternatives.

When code generation tools define classes at runtime, there are currently several ways of achieving that:

1) Using sun.misc.Unsafe::defineClass to define a class directly. This is a fairly easy API that even allows defining classes in the bootstrap loader.
2) Accessing protected methods of java.lang.ClassLoader via Java reflection. This allows more fine-grained control over class loading, for example by respecting class loading locks.
3) Creating a custom class loader as a child of the class loader of a proxied class. This avoids any use of internal API but limits proxying to public (and since Java 9 also exported) classes and their protected and public methods.
4) Using a Java agent to define classes using the Instrumentation API. Using this API, it is also possible to gain access to internal APIs, as it becomes possible to open encapsulated APIs.
5) Using JNI to define classes using its APIs or to avoid encapsulation altogether.

Of course, strategies (1) and (2) were always discouraged and might no longer work in a future Java release due to the encapsulation of internal APIs. Yet, as of Java 9, most code generation tools achieved Java 9 compatibility by migrating from solution (2) to solution (1) thanks to the jdk.unsupported module. Method (5) is rarely used as it requires the inclusion of C code for something that can be achieved more easily by other means.

As of Java 9, the JVM offers a new approach to defining classes:

6) Using java.lang.invoke.MethodHandles.Lookup::defineClass.

While Byte Buddy supports this new approach as a user-chosen class definition strategy, the API is not convenient enough for most use cases.
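
To illustrate strategy (6), here is a minimal sketch of Lookup::defineClass that runs on Java 9 or later without any dependencies. The class name LookupDefineSketch, its helper methods, and the package com.example.foreign are made up for this illustration. The class file is written by hand only to keep the example self-contained; a real tool would generate the bytes with a library. The sketch defines an empty class in the package of the lookup class and then tries the same for a foreign package.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

public class LookupDefineSketch {

    // Hand-writes the class file of an empty public class that extends Object.
    // Hypothetical helper: a real tool would emit these bytes with a code generation library.
    public static byte[] emptyClassFile(String internalName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (Java 8 format)
            out.writeShort(5);                                   // constant pool count (entries + 1)
            out.writeByte(1); out.writeUTF(internalName);        // #1 Utf8: name of this class
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: name of the super class
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0001 | 0x0020);                     // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // no interfaces
            out.writeShort(0);                                   // no fields
            out.writeShort(0);                                   // no methods
            out.writeShort(0);                                   // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Defines an empty class in the package of this sketch using strategy (6).
    public static Class<?> defineInOwnPackage(String simpleName) {
        MethodHandles.Lookup lookup = MethodHandles.lookup(); // full access to the own package
        String pkg = lookup.lookupClass().getPackageName();
        String internalName = pkg.isEmpty() ? simpleName : pkg.replace('.', '/') + "/" + simpleName;
        try {
            return lookup.defineClass(emptyClassFile(internalName));
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Tries to define a class in a foreign (made-up) package with the same lookup.
    public static boolean rejectsForeignPackage() {
        try {
            MethodHandles.lookup().defineClass(emptyClassFile("com/example/foreign/GeneratedHelper"));
            return false;
        } catch (IllegalArgumentException e) {
            return true; // the bytes name a class outside the lookup class's package
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> helper = defineInOwnPackage("GeneratedHelper");
        System.out.println(helper.getName() + " defined by " + helper.getClassLoader());
        System.out.println("foreign package rejected: " + rejectsForeignPackage());
    }
}
```

The defined class ends up in the same class loader and runtime package as the lookup class. The second call is rejected because the class file names a package other than that of the lookup class.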

Code generation is mainly used for the following two purposes:

A) When defining a proxy, the proxy class is normally defined in the same package as the proxied class. This way, a proxy can be created for package-private classes, and it can proxy package-private methods. Using strategy (6), it is however not possible to define a class in a package other than the package of the class that created the lookup, as this would require PACKAGE access for the target package. If the proxy is created by a module other than the module of the proxied class, this access right is never available, even if the proxied class's module opens its package to the module that generates the proxy. In this context, strategy (3) is not an option either, as the runtime package of a class defined by the child class loader would differ from the runtime package of the user class.

B) When programming a Java agent, a class enhancement sometimes makes it necessary to define an auxiliary class in the same package as the instrumented class. This is similar to javac's need for such classes, where it sometimes defines anonymous classes to provide a certain type for using an API. Unfortunately, the ClassFileTransformer::transform method does not provide a method handle lookup for the package of the instrumented class. At the same time, the Java agent itself typically lives in a different package than the instrumented class, so it cannot create its own lookup, which makes (6) inapplicable. Of course, (3) is not an option in this case either.

For scenario (A), one could argue that for many use cases, access to package-private classes and methods is not necessary as it breaks the Java programming language's encapsulation model. However, giving such access has proven useful in the past: the Spring framework, for example, induces a bean scope when a Java configuration class's method is defined as package-private.
And for the Mockito framework, such access allows for the creation of package-private mocks, which spares users from widening the visibility of such classes only for a unit test.

For (B), a Java agent is able to access internal APIs by opening packages. Providing a method handle lookup for the target class would however offer a cleaner, more standardized approach. It is however unclear what lookupClass the method handle lookup would be assigned, as the instrumented class is not necessarily loaded when the class file transformer is applied.

Additionally, some proxying tools such as Mockito require an API to instantiate a class without invoking a constructor. This way, a mock can be created without triggering any user code, which might have unwanted side effects or throw an exception for invalid inputs that are unknown to the mocking framework. I understand that such instantiations are frowned upon as they break the object model. But again, this possibility has proven to be very useful in the past, and it would be too bad if such libraries could no longer be maintained in the future.

To create instances without invoking a constructor, there are currently several options:

7) Using sun.misc.Unsafe::allocateInstance or the also internal reflection factory. This is often done via the Objenesis library. If such access was encapsulated, a Java agent could still open these APIs.
8) Using JNI to avoid encapsulation or allocating an instance without a constructor call from JNI.

Again, (8) is a rarely chosen approach, but (7) via the use of Objenesis is still common.

As a result, even with Java 9 being supported by many popular frameworks, a migration away from internal APIs has not yet been achieved. I would therefore like to suggest the following extensions:

C) When a module opens a package, other modules should gain package access to this package when creating method handle lookups.
This way, if a user opens a package containing Spring beans to the Spring framework, it could proxy all of these beans as it does today. Since opening a package also permits reflection on package-private types and methods of this package, this is not a security concern either.

D) A class file transformer should be provided with an instance of a method handle lookup for the instrumented class as an argument. This way, Java agents gain an easy and standardized way of defining auxiliary classes, which is currently rather cumbersome.

E) There should be a jdk.test module that is not resolved by default, that is not included in non-JDK distributions of the JVM, and that contains an API for instantiating classes without a constructor invocation. By depending on this module, test libraries that offer such insecure abilities can also make clear that they are meant for testing and not for production. With Mockito, we regularly get inquiries about performance issues when the library is used in production systems, which it is not designed for. This module could also include an API for getting hold of an Instrumentation instance for the current JVM process. This would be useful for many testing libraries such as Mockito and also for testing Java agents under development. Currently, it is necessary to self-attach using the attachment API. Since Java 9, it is additionally required to explicitly allow such self-attachment or to use an intermediate Java process to avoid the constraint.

With these three extensions, I believe that the many users of code generation tools could easily migrate away from the use of internal APIs within a few months, which would allow a full encapsulation of JVM-internal APIs without any major disruptions.

Thank you for your time and feedback on my proposal!

Best regards, Rafael