A lot of programmers believe they can parse HTML at all.
Go read the official W3C parser algorithm, I’ll wait. First thing you’ll notice is that there is no formal grammar: the spec simply describes the parser’s state machine. Then you notice each past-and-present HTML version has its own parser algorithm spec, and there is no official documentation on the differences between them, never mind the rationale. Then you realize that HTML5 is now a “living spec”, so the parser algorithm at that link occasionally changes, and past versions and changelogs are deliberately not published...
HTML is a parseable format like PHP is a programming language. There is no spec; there is only whatever bugs and quirks a particular browser version happens to contain.
(Oh, you thought browsers actually follow any of those published W3C specs? HAHAHAHAHA sob.)
HTML is indeed a turd of a standard.