I've written entire (non-trivial) applications in assembly.
Ultimately the JIT compiler executes machine code, C/C++ executes machine code. C/C++ executes on average 1 instruction per language feature. That's not true. Using adaptive optimization, a Java program could (in theory) perform faster than an equivalent C++ program. In practice, this doesn't happen often today, but it could become a more common occurrence in the future. The JIT compiler has been getting better and better over the years, so it's possible that, through adaptive optimization (which the HotSpot JIT compiler already uses), Java will actually be faster than C++ in the future.
JIT must execute at least 3 steps for each bytecode:
1. Load the next bytecode.
2. Jump to the appropriate section of the interpreter.
3. Execute the code that does what the bytecode stands for.
Regardless of how efficient step 3 is, it can never be faster than the same functionality implemented in C/C++, since C/C++ could simply use inline assembly to perform the exact same sequence of machine code. And then Java has to perform steps 1 and 2 while C/C++ does not.
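To make the three steps concrete, here is a rough sketch of a bytecode dispatch loop in C++. The opcodes (PUSH, ADD, MUL, HALT) and the functions run and direct are made up for illustration; they are not real JVM bytecodes, and this is not how HotSpot is actually structured. It only shows where the load and jump overhead sits relative to the actual work.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for illustration only; not real JVM bytecodes.
enum Op : std::uint8_t { PUSH, ADD, MUL, HALT };

// Runs a tiny stack-machine program and returns the top of the stack.
// Assumes the program is well-formed and ends with HALT.
std::int64_t run(const std::vector<std::uint8_t>& code) {
    std::vector<std::int64_t> stack;
    std::size_t pc = 0;
    for (;;) {
        std::uint8_t op = code[pc++];   // step 1: load the next bytecode
        switch (op) {                   // step 2: jump to the matching handler
        case PUSH:                      // step 3: do what the bytecode stands for
            stack.push_back(code[pc++]);
            break;
        case ADD: {
            std::int64_t b = stack.back(); stack.pop_back();
            std::int64_t a = stack.back(); stack.pop_back();
            stack.push_back(a + b);
            break;
        }
        case MUL: {
            std::int64_t b = stack.back(); stack.pop_back();
            std::int64_t a = stack.back(); stack.pop_back();
            stack.push_back(a * b);
            break;
        }
        case HALT:
            return stack.empty() ? 0 : stack.back();
        default:
            return 0;                   // unknown opcode: give up
        }
    }
}

// The same computation written directly: only the step 3 work remains.
std::int64_t direct() {
    return (2 + 3) * 4;
}
```

Running the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT through run gives 20, the same as direct(), but every bytecode pays for the load and the switch before any arithmetic happens.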
If you mean that, given 2 equally poorly implemented, horribly cumbersome, and completely non-optimized programs, the JIT version will eventually run faster than the C/C++ one, then perhaps. But no language can guarantee good performance if you intentionally cripple it with poorly written code.
C/C++ will never lose, because it can always just implement a Java interpreter and achieve parity of performance.