Are all Java programs reproducible?
I am learning about deterministic and reproducible builds and come from more of a C/C++ background. Assuming the JDK and JVM/JRE are the same, does compiling Java source to bytecode always yield the same result, even when building on different machines? Is Java usually more deterministic than C/C++ and other compiled languages?
No, it isn't always deterministic. If you use multiple threads, it stops being deterministic. If you use Random or similar, it also stops being deterministic (because that's literally what you want with randomness).
and if you rely on things like the system time or maybe some other OS things or native code, it easily becomes nondeterministic as well
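As a rough illustration of keeping these sources under control, here is a minimal sketch that seeds the random generator and passes in a clock instead of reading the system time. The class and method names (ControlledInputs, roll, stamp) are hypothetical and not from this thread.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.Random;

class ControlledInputs {
    // The same seed gives the same sequence on every run:
    // the algorithm of java.util.Random is specified in its Javadoc.
    static int[] roll(long seed, int count) {
        Random random = new Random(seed);
        int[] values = new int[count];
        for (int i = 0; i < count; i++) {
            values[i] = random.nextInt(6) + 1; // dice values 1..6
        }
        return values;
    }

    // Taking a Clock parameter lets callers pass a fixed clock
    // instead of depending on the system time.
    static String stamp(Clock clock) {
        return "built at " + Instant.now(clock);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roll(42L, 5)));
        Clock fixed = Clock.fixed(Instant.parse("2024-01-01T00:00:00Z"), ZoneOffset.UTC);
        System.out.println(stamp(fixed));
    }
}
```

Multithreading is harder to pin down this way, since the scheduling itself is outside the program's control.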
But other than that, it's pretty deterministic (way more than many other programming languages like C/C++) in terms of the result you get - and this is even the case when using different JVMs as the specs don't leave much room for implementation-defined or even undefined behaviour
ofc this doesn't include how long the program takes to execute or how the JIT optimizes it (the actually executed machine code)
I see, the multithreading stuff makes sense
I was just wondering in terms of the executable that’s built from compiling
How often do you need to configure the JVM/JRE? Is it a good practice to have explicitly defined configurations?
If that makes sense ^
In C/C++, you have many sources of undefined or implementation-defined behavior (the latter means there are multiple allowed options) coming from the language. For example, in C/C++, the && operator doesn't define the order in which the arguments are executed. In Java, the program must behave exactly as if these are executed from left to right (except with multithreading).
Gotcha
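To make the Java side of this concrete: the JLS (§15.7) requires the operands of a binary operator to be evaluated left to right. Here is a minimal sketch with hypothetical names (EvalOrder, left, right, trace). In C/C++, the order of the two calls in an expression like left() + right() is unspecified.

```java
class EvalOrder {
    static final StringBuilder log = new StringBuilder();

    static int left() {
        log.append("left ");
        return 1;
    }

    static int right() {
        log.append("right");
        return 2;
    }

    // Returns the order in which the two operands were evaluated.
    static String trace() {
        log.setLength(0);            // reset the log between calls
        int sum = left() + right();  // Java: left() always runs before right()
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace()); // prints "left right"
    }
}
```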
Do you mean the bytecode or machine-code
I am going to guess that I probably meant bytecode, but I am also interested in hearing more about the machine-code as well
It can be useful, especially for optimization and diagnosing issues. But it mostly doesn't have an effect on the determinism we were talking about. There are some exceptions, though. For example, you can enable/disable cryptographic algorithms (this configuration is something where different JDKs may actually differ) or allow access to modules/packages that shouldn't be accessible.
The bytecode is fairly close to the Java code but the specification doesn't say what exact bytecode must be produced. The specs say what bytecode instructions must do (JVM spec) and they say what the source code/language must do (JLS) but not how the language is transformed to bytecode.
But in practice, different compilers result in almost the same bytecode.
And these compilers (javac, EJC) are deterministic AFAIK so they produce the same bytecode
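One way to check this for yourself is to compile the same source twice with the same compiler and compare the resulting class files byte for byte. Below is a minimal sketch using the standard javax.tools API. The class name CompileTwice and the tiny Hello source are made up for illustration, and it assumes it runs on a JDK (Java 11 or later), not a bare JRE.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileTwice {
    // Compiles the given source into outDir and returns the bytes of Hello.class.
    static byte[] compile(Path source, Path outDir) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler; run this on a JDK");
        }
        int status = compiler.run(null, null, null, "-d", outDir.toString(), source.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed");
        }
        return Files.readAllBytes(outDir.resolve("Hello.class"));
    }

    // Compiles the same source twice and reports whether the bytecode is identical.
    static boolean sameBytecode() {
        try {
            Path dir = Files.createTempDirectory("repro");
            Path source = dir.resolve("Hello.java");
            Files.writeString(source, "public class Hello { int twice(int x) { return x * 2; } }\n");
            byte[] first = compile(source, Files.createDirectory(dir.resolve("out1")));
            byte[] second = compile(source, Files.createDirectory(dir.resolve("out2")));
            return Arrays.equals(first, second);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("identical bytecode: " + sameBytecode());
    }
}
```

This only compares .class files from one compiler version. Packaging is a separate concern: JAR entry timestamps and entry ordering can still differ between builds.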
Now with machine-code, it's a different story. The JIT is responsible for transforming bytecode to machine-code while the application is running and it does so in a way where it thinks that it results in the best performance. This process is nondeterministic but it must adhere to the rules set by the specification. It can produce machine-code in the way it wants to but it must not change the meaning of the program. While the application is running, it collects a lot of profiling data which it uses for better performance. It makes assumptions about the program (e.g. if something never happened the last 10000 times, it will probably not happen) but it makes sure that the program still works correctly if these assumptions are violated.
You can see the specifications here: https://docs.oracle.com/en/java/javase/24/docs/specs/index.html
The most important (in most cases) are the Java Language specification (JLS) and the Java VM specification (JVM spec)
and also the javadocs (Java SE & JDK spec)
The logical AND operator is evaluated left-to-right in C/C++ as well?
not by spec, at least in some C versions AFAIK
though maybe it was &
Probably &
C Standard guarantees LTR and Short circuit behaviour
note: I meant the order in which operands are evaluated
like
i++
True
This is a UB
actually I think it might be something like "if both are true, the right side may be evaluated before the left side"
Afaik almost all operands and some operators don't guarantee LTR or sequence points
Most of that stuff undergoes compiler optimisation and is non-deterministic
so
for example
but idk whether the order is guaranteed here by the spec
yeah something like that
This is safe
sure it's safe - but is it also deterministic?
It's implementation defined behaviour*
undefined behaviour would be "anything could happen"
True. The evaluation of those statements varies between compilers and architectures, and I don't think they document these snippets very well
I think I called it implementation specific behaviour before - that's my fault
It should be. && guarantees LTR and so it'll be executed like this.
i) i++ > 0 evaluates to false when i = 0, so it'll just short-circuit
Uhh deterministicity (I cannot spell) is a huge topic and has a lot of debate around stuff
Esp when it comes to low level
ah my fault, the example is wrong
(edited)
Is the value of i allowed to have one of multiple values?
it would probably always be the same in practice but I am not sure about the spec
yes ik but I care about it being deterministic by the spec/whether the spec allows multiple options
It should also be deterministic, the execution is a bit different.
i) i = 1, i > 0 -> true, moves to ROE, post-increment makes i = 2.
ii) ROE: (i = 5) > 0 sets i = 5, and then it evaluates to true.
i is not allowed to have one of multiple possible values at any given step in a non-deterministic way.
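For comparison, here is the same kind of step-by-step trace in Java, where the result is fixed by the JLS. The original snippet is not preserved in this thread, so the expression below is an assumption reconstructed from the two steps described above, and the names (ShortCircuit, run) are hypothetical.

```java
class ShortCircuit {
    // Evaluates `i++ > 0 && (i = 5) > 0` and returns the final value of i.
    static int run(int start) {
        int i = start;
        // Left operand first: compares the old value of i, then increments it.
        // The right operand (the assignment) only runs if the left one was true.
        boolean both = i++ > 0 && (i = 5) > 0;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(run(1)); // 5: left side true (i becomes 2), then i = 5
        System.out.println(run(0)); // 1: left side false (i becomes 1), right side skipped
    }
}
```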
Uhh, something like this has an unspecified order of evaluation and other side effects..
Both the examples before were deterministic by spec.
Why does your second example have unspecified order?
I don't want to think about C++/your first example lol
It's unspecified by C standard and is open to compiler optimisation
and the function calls are necessary for that?
But yes that's the thing I meant I think
Not exactly
This is also a UB
Isn't that literally what my example did? Or did the = within the conditional introduce a sequence point?
anyway you still mean implementation defined behaviour
anyway these things are strictly defined in Java
Your example only had one assignment in a sequence point.
whatever I don't want to deal with shenanigans in low level programming languages anyway
True lol
@DaMango Summary: It's complicated in C/C++
Like for example:
this is a UB (cite: C11 standard), but in C++14 and later, it's not.
that's why you just use rust and enjoy life
no, we use Java here
🤣 Just kidding, I love java, I've been working with it for more than 5 years now.