Cost savings

A warm Java program and a native binary have very similar execution speed.

The problem is that Java was originally built to run perpetually, where slow initial startup is not a big concern. With serverless, the lifespan is much shorter. Google addressed this with Golang; the JVM world is putting its hope in GraalVM.

How are AWS Lambda charges calculated?

Size of memory

A 256MB execution environment costs twice as much as a 128MB one. CPU resources are in some respects proportional to allocated memory, but not linearly. We recommend you do a few test executions to see what is best for you. Note: the second run is on a warm instance, so re-upload the JAR to ensure you test a cold start if that is the metric you're looking at.
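If you want to script these experiments rather than click around the console, here is a minimal sketch using the AWS SDK for Java v2; the function name my-function and the artifact path build/function.zip are hypothetical placeholders:

```java
import java.nio.file.Files;
import java.nio.file.Paths;

import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.lambda.LambdaClient;
import software.amazon.awssdk.services.lambda.model.UpdateFunctionCodeRequest;
import software.amazon.awssdk.services.lambda.model.UpdateFunctionConfigurationRequest;

public class LambdaMemoryExperiment {
    public static void main(String[] args) throws Exception {
        try (LambdaClient lambda = LambdaClient.create()) {
            // Try a different memory size; CPU share (and price per 100ms tick) scales with it.
            lambda.updateFunctionConfiguration(UpdateFunctionConfigurationRequest.builder()
                    .functionName("my-function")   // hypothetical function name
                    .memorySize(256)               // MB
                    .build());

            // Re-upload the artifact so the next invocation is guaranteed to be a cold start.
            lambda.updateFunctionCode(UpdateFunctionCodeRequest.builder()
                    .functionName("my-function")
                    .zipFile(SdkBytes.fromByteArray(
                            Files.readAllBytes(Paths.get("build/function.zip"))))
                    .build());
        }
    }
}
```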

Execution time

You are charged in 100ms ticks for the time your handler code executes. In addition, you pay for startup time spent in code you control. For regular Java that means static initializers and the like; for native, everything run in the Launcher. If you add a bitcoin miner, you pay for it :)
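To make "code you control" concrete, here is a minimal plain-Java handler sketch using the aws-lambda-java-core interfaces (class and field names are just for illustration):

```java
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.Map;

public class GreetingHandler implements RequestHandler<String, String> {

    // Static initializers run once per cold start; that time is "code you control"
    // and is billed on top of the per-invocation ticks.
    private static final Map<String, String> CONFIG = loadConfig();

    private static Map<String, String> loadConfig() {
        // Anything expensive here (parsing config, opening clients, warming caches)
        // is paid for on every cold start.
        return Map.of("greeting", "Hello");
    }

    @Override
    public String handleRequest(String input, Context context) {
        // The handler body is what the 100ms ticks meter per invocation.
        return CONFIG.get("greeting") + ", " + input;
    }
}
```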

A native binary has no JIT, so it requires less memory, which means each tick is cheaper.

Provisioning

To counter slow startup for e.g. the JVM languages, you can enable Provisioned Concurrency. This means you have a group of hot standbys ready to work. This is great for uneven load, but it's costly. Since the standbys only hold memory, not CPU, it's not as expensive as the 100ms ticks.
With native code you can turn off provisioning, and that's a huge saving!
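If you do keep the JVM and turn on Provisioned Concurrency, a minimal sketch with the AWS SDK for Java v2 looks roughly like this (function name, alias and pool size are hypothetical):

```java
import software.amazon.awssdk.services.lambda.LambdaClient;
import software.amazon.awssdk.services.lambda.model.PutProvisionedConcurrencyConfigRequest;

public class EnableProvisionedConcurrency {
    public static void main(String[] args) {
        try (LambdaClient lambda = LambdaClient.create()) {
            // Keep a pool of pre-initialized execution environments ready to serve requests.
            lambda.putProvisionedConcurrencyConfig(PutProvisionedConcurrencyConfigRequest.builder()
                    .functionName("my-function")          // hypothetical function name
                    .qualifier("prod")                    // provisioning targets an alias or version
                    .provisionedConcurrentExecutions(10)  // number of hot standbys
                    .build());
        }
    }
}
```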

In our simple tests, native code used 62% of Java 11's peak memory. The JIT and its runtime statistics take a lot of memory.

In our simple tests, the total time for cold start plus the first completed request averaged 379ms for Java 11 on a 1024MB instance, and 175ms for native code on 768MB.

As you pay for all executed code that's in your control, in this case you pay for 200ms of execution for native (the whole 175ms is your code, rounded up to two ticks) and 100ms for Java 11 (the JVM's own startup is not in your control, and the billable part fits in one tick).
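As a rough sketch of that rounding (assuming the whole 175ms of the native start is billable, and only a sub-100ms handler portion of the Java start is):

```java
public class BilledTicks {
    // Lambda bills execution in 100ms ticks, rounded up.
    static long billedMs(long billableMs) {
        return ((billableMs + 99) / 100) * 100;
    }

    public static void main(String[] args) {
        System.out.println("native : " + billedMs(175) + "ms"); // 200ms billed
        System.out.println("java 11: " + billedMs(80) + "ms");  // 100ms billed (80ms is an assumed handler time)
    }
}
```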

After startup, Java code runs as interpreted bytecode until the JVM has collected enough statistics to do a first JIT compilation. The default for Oracle HotSpot is ~10,000 executions of a code path, but you can set it to anything, including 1 (HotSpot's -XX:CompileThreshold flag). If you JIT immediately you take a considerable performance hit, but the same goes if you don't. JIT is not a good fit for serverless.
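For illustration, a small sketch of the warm-up effect (timings are approximate and depend heavily on the JVM and hardware):

```java
public class JitWarmup {
    // Enough arithmetic that HotSpot considers the method worth compiling.
    static double work(int n) {
        double acc = 0;
        for (int i = 1; i <= n; i++) acc += Math.sqrt(i);
        return acc;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        work(10_000);                                  // first call: interpreted bytecode
        long cold = System.nanoTime() - t0;

        for (int i = 0; i < 20_000; i++) work(10_000); // let HotSpot gather statistics and compile

        long t1 = System.nanoTime();
        work(10_000);                                  // same call once the JIT has kicked in
        long warm = System.nanoTime() - t1;

        System.out.printf("first call: %d us, after warm-up: %d us%n", cold / 1_000, warm / 1_000);
    }
}
```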

Java JRE + provisioned concurrency

In AWS's provisioned concurrency example, they pay $30/mo for provisioned concurrency.
Provisioned Concurrency charges = 7.2M GB-s * $0.000004167 = $30

In AWS's provisioned concurrency example, they pay $0.24/mo for the requests, i.e. for their infrastructure to be bothered at all.
Monthly request charges = 1.2M * $0.20/M = $0.24

In AWS's provisioned concurrency example, they pay $11.67/mo in compute charges.
Total compute charges = 1.2M GB-s * $0.000009722 = $11.67

Total charges = $30 + $0.24 + $11.67 = $41.91
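To make the arithmetic reproducible, here is a small sketch with the same numbers (the rates are the ones quoted above and vary per region):

```java
public class JvmProvisionedConcurrencyCost {
    public static void main(String[] args) {
        double provisionedGbSeconds = 7_200_000; // 7.2M GB-s kept warm per month
        double requests = 1_200_000;             // 1.2M requests per month
        double computeGbSeconds = 1_200_000;     // 1.2M GB-s of execution per month

        double provisioned = provisionedGbSeconds * 0.000004167; // ~ $30.00
        double requestsFee = requests / 1_000_000 * 0.20;        // ~ $0.24
        double compute = computeGbSeconds * 0.000009722;         // ~ $11.67

        System.out.printf("Total: $%.2f%n", provisioned + requestsFee + compute); // ~ $41.91
    }
}
```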

Refrem pricing

Native code does not need provisioned concurrency.
Provisioned Concurrency charges = $0

We have the same number of requests to the AWS datacenter.
Monthly request charges = 1.2M * $0.20/M = $0.24

Since we're working optimally from the start and consume less memory, we'll conservatively assume half the CPU time and 75% of the memory.
Total compute charges = (0.5 * 0.75) * 1.2M GB-s * $0.000009722 = $4.37

Total charges = $0 + $0.24 + $4.37 = $4.61
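And the same sketch for the native case, with the half-CPU-time and 75%-memory assumption from above:

```java
public class NativeCost {
    public static void main(String[] args) {
        double requests = 1_200_000;            // same 1.2M requests per month
        double jvmComputeGbSeconds = 1_200_000; // baseline GB-s from the JVM example

        // Assumption from the text: half the CPU time, 75% of the memory.
        double nativeComputeGbSeconds = jvmComputeGbSeconds * 0.5 * 0.75;

        double provisioned = 0.0;                              // no provisioned concurrency
        double requestsFee = requests / 1_000_000 * 0.20;      // ~ $0.24
        double compute = nativeComputeGbSeconds * 0.000009722; // ~ $4.37

        System.out.printf("Total: $%.2f%n", provisioned + requestsFee + compute); // ~ $4.61
    }
}
```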

So the conclusion is roughly a 90% saving ($4.61 vs. $41.91).