Brief (but required) history lesson:
Back when IE had 90% or so of the market share, browsers were essentially an enigma. It was hard to know exactly how things worked. Now that open source browsers have a significant part of the market share and there is competition between them in terms of functionality, how the browser works is coming into the light.
Front end engineers follow a lot of rules that we consider standard, but many F2Es don’t know why we do some of these things. It’s crucial that you be able to justify to your clients or your boss why you take the time to follow best practices, and that you make decisions based on all of the facts.
At its most basic, the browser’s job is just to fetch a requested resource and return it to the user. The browser can fetch more than just HTML files; for example, when your browser asks you if you want to download “catpics.zip” it’s still fulfilling a request, it’s just one that it can’t present visually. The browser does know how to visually present HTML, XML, CSS and certain image files though, and the rules for how it interprets them are set out by the World Wide Web Consortium (W3C).
The way I’ve written this process makes it seem very step-by-step and straightforward. It’s really not, and the process varies slightly from browser to browser. The browser’s rendering engine is responsible for parsing the HTML and CSS and displaying the result to the user. When there were no web standards, web developers didn’t have very high expectations of how consistent their websites would look from browser to browser. The easiest way to be consistent was to use HTML-based formatting, and hope for the best. Since then, CSS has come to be rendered more consistently and the differences between browsers are much smaller.
The rendering engine starts by getting all of the HTML and creating the DOM tree. The DOM tree is basically a super dissected version of the HTML, breaking every element down into its component parts and categorizing them. For example, when you write a paragraph with HTML, you write the tags and the text inside the paragraph. In a sense, you’re writing out the tree: the paragraph tags around the text are one “node” or “branch” of the tree, and the text inside of your tags is another node that stems from the paragraph node.
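To make the paragraph example concrete, here is a minimal sketch of that tree in Python. It leans on the standard library’s `html.parser` and is nothing like a real browser parser: the `Node`, `TreeBuilder`, `build_tree` and `describe` names are made up for this illustration, and it assumes tidy HTML where every opening tag has a closing tag.

```python
from html.parser import HTMLParser


class Node:
    """One node in a toy DOM tree: either an element or a piece of text."""

    def __init__(self, kind, value, parent=None):
        self.kind = kind        # "element" or "text"
        self.value = value      # the tag name, or the text itself
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a toy tree from well-formed HTML (assumption: no error recovery)."""

    def __init__(self):
        super().__init__()
        self.root = Node("element", "#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node("element", tag, self.current)
        self.current.children.append(node)
        self.current = node     # descend into the new element

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent  # climb back up to the parent

    def handle_data(self, data):
        if data.strip():        # skip whitespace-only text between tags
            self.current.children.append(Node("text", data, self.current))


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def describe(node, depth=0):
    """Return one indented line per node, parents before children."""
    lines = ["  " * depth + f"{node.kind}: {node.value}"]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines


print("\n".join(describe(build_tree("<p>Some text</p>"))))
# element: #document
#   element: p
#     text: Some text
```

The text node sits underneath the paragraph node, which is the “stems from” relationship described above.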
Once the rendering engine has assembled the DOM tree, it begins creating the render tree. The render tree is what gets displayed to you on the screen. The DOM tree is the part your browser really cares about in the background: whenever anything adds or changes data, the DOM tree is modified first, and then the render tree is updated second.
Now, the render tree isn’t built in just one step. The browser has to take into account all of the inherent rules of the DOM nodes, then apply all of the CSS rules it’s been given, calculate how things should be organized on the screen, and at last paint them onto the screen in the order they appear in the DOM.
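Those steps can be sketched roughly like this. Everything here is a simplifying assumption for illustration: the DOM is a plain nested dict, CSS rules only match on tag name, “layout” just stacks boxes vertically using a made-up `height` value, and “painting” returns a list of strings instead of drawing anything.

```python
# Built-in defaults, standing in for the browser's inherent rules per element.
DEFAULT_STYLES = {
    "h1": {"height": 40},
    "p": {"height": 20},
    "script": {"display": "none"},
}


def computed_style(tag, author_css):
    """Start from the defaults, then let the page's own CSS override them."""
    style = {"display": "block", "height": 0}
    style.update(DEFAULT_STYLES.get(tag, {}))
    style.update(author_css.get(tag, {}))
    return style


def build_render_tree(dom_node, author_css):
    """Copy visible DOM nodes into render boxes; hidden subtrees are skipped."""
    style = computed_style(dom_node["tag"], author_css)
    if style["display"] == "none":
        return None  # still in the DOM, but never reaches the screen
    children = []
    for child in dom_node.get("children", []):
        box = build_render_tree(child, author_css)
        if box is not None:
            children.append(box)
    return {"tag": dom_node["tag"], "style": style, "children": children}


def layout(box, y=0):
    """Give every box a vertical position; return the next free position."""
    box["y"] = y
    y += box["style"]["height"]
    for child in box["children"]:
        y = layout(child, y)
    return y


def paint(box):
    """Return paint commands in DOM order (parents first, then children)."""
    commands = [f"{box['tag']} at y={box['y']}"]
    for child in box["children"]:
        commands.extend(paint(child))
    return commands


dom = {"tag": "body", "children": [
    {"tag": "h1"}, {"tag": "script"}, {"tag": "p"}, {"tag": "aside"},
]}
css = {"h1": {"height": 60}, "aside": {"display": "none"}}

render_tree = build_render_tree(dom, css)
layout(render_tree)
print(paint(render_tree))
# ['body at y=0', 'h1 at y=0', 'p at y=60']
```

Notice that `script` and `aside` are still in the DOM but never show up in the painted output, and that the page’s CSS (`height: 60` for `h1`) wins over the built-in default.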
Fortunately for the user, the browser wants you to have your content ASAP and will start painting things to the screen as soon as it has enough information to do so. This is a lot like putting together chunks of a puzzle. You put together the parts that make sense, the parts you can see, and then assemble those parts into a bigger whole. If the browser didn’t do this, you would have to wait for all of the HTML and CSS to be fetched and processed before you saw anything at all.
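Here is a rough sketch of that “paint what you can” idea, again using Python’s `html.parser` as a stand-in for the real thing. The `IncrementalPainter` and `render_stream` names are invented for this example, and the rule it uses (an element is ready to paint once its closing tag has arrived) is an assumption that is much cruder than what real browsers do.

```python
from html.parser import HTMLParser


class IncrementalPainter(HTMLParser):
    """Paints an element's text as soon as its closing tag has arrived."""

    def __init__(self):
        super().__init__()
        self.pending = []   # text for the element that is still arriving
        self.painted = []   # text that is already "on screen"

    def handle_data(self, data):
        self.pending.append(data)

    def handle_endtag(self, tag):
        text = "".join(self.pending).strip()
        self.pending = []
        if text:
            self.painted.append(text)


def render_stream(chunks):
    """Yield a snapshot of the screen after each network chunk arrives."""
    painter = IncrementalPainter()
    for chunk in chunks:
        painter.feed(chunk)
        yield list(painter.painted)


# The second paragraph is cut in half by the chunk boundary on purpose.
chunks = ["<h1>Title</h1><p>First par", "agraph</p><p>Second</p>"]
for snapshot in render_stream(chunks):
    print(snapshot)
# ['Title']
# ['Title', 'First paragraph', 'Second']
```

The heading is on screen after the first chunk, even though the rest of the page hasn’t arrived yet. The half-finished paragraph waits until the browser has the whole piece.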
This is also a very high-level, simplified version of how the browser thinks. Each browser works a little differently, and that can affect a lot of different things, from paint times to error handling. See part two of Browsers: The Very Basics for more information.