It is kind of 'whoa, man, how can that work???' but I think you are describing the phenomenon known as 'self-hosting':
Languages (or toolchains/platforms) don't start out as self-hosting. They start life having been built on an existing platform; at a certain point they become functional enough that programs can be written in them which understand the very syntax they themselves are written in.
There is a great example in the classic AWK book, which introduces an AWK program that can parse other AWK programs (a cut-down version of the language, as it happens): see link below.
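To make the "program that understands its own language" idea concrete, here is a minimal sketch in Python rather than AWK (this is not the example from the book). It uses the standard library's `ast` module to parse a small piece of Python source; the `SOURCE` text and the `function_names` helper are made up purely for illustration.

```python
import ast

# A tiny Python program, held as plain text, that Python itself will parse.
SOURCE = '''
def greet(name):
    return "hello " + name

def shout(name):
    return greet(name).upper()
'''

def function_names(source):
    """Return the names of the top-level functions defined in `source`."""
    tree = ast.parse(source)  # build a syntax tree from the source text
    return [node.name for node in tree.body
            if isinstance(node, ast.FunctionDef)]

print(function_names(SOURCE))
```

The point is only that the parser and the thing being parsed are in the same language; nothing magical is needed for that to work.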
I think the thing to remember is this: if you have (say) a JVM written in Java, which can therefore run Java bytecode, then the JVM that runs that Java-written JVM has itself to be hosted natively (perhaps it was written in C and then compiled to machine code). This is eventually true of any self-hosting program, somewhere along the line.
So the mystery is removed - because at some point, there is a native machine-code program running below everything.
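The layering can be sketched with a toy, assuming a made-up three-instruction "bytecode" (the `PUSH`/`ADD`/`MUL` opcodes and the `run` function are hypothetical, not any real VM's instruction set). The toy program runs on `run`, `run` runs on the Python interpreter, and in the case of CPython that interpreter is a C program compiled to machine code.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

# Computes (2 + 3) * 4 on the toy machine.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
print(run(PROGRAM))  # prints 20
```

However many such layers you stack up, the bottom one is always something the hardware can execute directly.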
It's kind of the equivalent of being able to describe the English (etc.) language using the English language itself... maybe.
In the original (and still the best) book, The AWK Programming Language, the following are implemented (among many other things):
Try doing that with sed.
The AWK Programming Language, by Aho, Kernighan and Weinberger, is the best. The initials of the authors' names should tell you why...
For gawk, the manual covers the gaps between the language introduced in the book and the latest implementation.