Introduction
HTML stands for HyperText Markup Language. It was invented by Tim Berners-Lee in 1990 or 1991, and its first draft specification was published in 1993. As of this writing (begun on 6/8/2022), the current version of HTML is HTML5.
HTML is the foundational description language for all web pages. HTML consists of special syntax marks (often called "tags") interspersed with text. These tags define things called elements. Elements are defined by either one or two tags. An element has an opening and closing tag if it's capable of containing other elements. Otherwise an element is defined by a single tag containing a forward slash just before its closing angle bracket.
Note:
The HTML specification doesn't absolutely require closing tags or closing forward slashes, but your life will be much easier if you use them. All of the Web Workmanship project uses them.
Note:
Because Internet Explorer doesn't adhere to many HTML standards, nothing in Web Workmanship has been tested in Internet Explorer. The complexity cost of accommodating Internet Explorer isn't worth getting this ancient browser with less than 2% market share to render properly.
Element example: Two paragraphs and a horizontal line
A paragraph and a horizontal rule are both HTML elements. The following is an HTML5 snippet of two tiny paragraphs followed by a horizontal line:
<p>This is the first paragraph.</p><p>This is the second paragraph.</p><hr/>
In the preceding snippet:
- <p> starts a paragraph.
- </p> ends the current paragraph.
- <hr/> both begins and ends a horizontal line element.
- A horizontal line element cannot contain other elements, which is why it's ended with a forward slash before the closing angle bracket, instead of having a separate ending tag.
- Elements needn't start or end on their own line.
About line breaks in HTML source:
The HTML specification allows line breaks within HTML source code anywhere you can put a space, and line breaks act as spaces. For readability you should always precede the paragraph beginning and the paragraph end with a blank line.
Some will argue that for smaller uploads you shouldn't do this. My reply is that if you're that sensitive about upload size, you shouldn't be using the huge CSS files that most folks use. If you simply must eliminate whitespace that is so necessary to improve readability, then use a converter program to squeeze out the whitespace and write to a separate file, leaving your spaced HTML in place, and upload that no-space HTML file.
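As a rough illustration of the line break rule, here is a sketch with made-up paragraph text. Both versions should render the same way in a browser, because each line break in the source acts as a space:

```html
<!-- Compact: everything on one line. -->
<p>This is the first paragraph.</p><p>This is the second paragraph.</p><hr/>

<!-- Spaced out for readability: same elements, same rendering. -->
<p>
This is the first paragraph.
</p>

<p>
This is the second paragraph.
</p>

<hr/>
```

The second version is longer to upload, but much easier to scan when you come back to edit it.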
Use End Tags
The examples on this page all use either end tags or an ending slash. An example of end tags would be using </p> to end <p>. For elements that cannot contain other elements, a single tag both begins and ends them by placing a forward slash before the closing angle bracket. An example is <hr/>.
Theoretically, HTML doesn't require end tags on container-capable elements or closing forward slashes on non-container-capable elements. If you want to have a good, productive and easy life in HTML land, always close your elements using the proper closing technique. The reasons for this are discussed later in this page.
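To make the contrast concrete, here is a small sketch with made-up text. The first form relies on the parser to work out where each element ends, which HTML permits for paragraphs and for a horizontal line. The second form closes everything explicitly, which is the style this page uses:

```html
<!-- Permitted by HTML, but the parser must infer where each paragraph ends. -->
<p>First paragraph.
<p>Second paragraph.
<hr>

<!-- Explicitly closed: nothing is left to inference. -->
<p>First paragraph.</p>
<p>Second paragraph.</p>
<hr/>
```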
More About Elements
Elements perform tasks such as importing images, importing videos, splitting text into paragraphs, splitting lines within the same paragraph, creating horizontal lines within text, and creating headings of various levels (1 to 6).
Other HTML elements can be used to move to other web pages (using links, otherwise known as <a></a> elements). Still other elements can be used to create sections which can later have appearances attached to them via CSS and/or actions attached to them via JavaScript.
Container-capable element <pre></pre> is special, because most of what it contains is taken literally. For instance, ten consecutive spaces are rendered as ten consecutive spaces, instead of compressing them into one space, as is done inside other HTML elements. Line breaks are rendered as line breaks, and multiple line breaks aren't compressed down into one line break or one space. As a result, <pre></pre> is what you use when showing multi-line source code or multi-line command line operations.
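The whitespace behavior just described can be sketched as follows. The text is invented for the example; the only point is the difference between the two elements:

```html
<!-- Inside <p>, the run of spaces and the line break collapse to single spaces. -->
<p>name:          value
second line</p>

<!-- Inside <pre>, the spacing and the line break are shown as typed. -->
<pre>name:          value
second line</pre>
```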
Because <pre></pre> renders most HTML elements as elements rather than literal text, multi-line command line operations can have special coloration for what the user types in. Consider the following example of a series of commands and their output, within a <pre></pre> element:
[slitt@mydesk web]$ ls -l examples
total 8
-rw-r--r-- 2 slitt slitt 1372 Jun  7 19:40 innerhtml_fail.html
-rw-r--r-- 2 slitt slitt 1372 Jun  7 19:40 innerhtml_fail.html.txt
[slitt@mydesk web]$ ls -l perceived
total 76
-rw-r--r-- 2 slitt slitt  5785 Jun  4 17:41 annoying.html
-rw-r--r-- 2 slitt slitt  5785 Jun  4 17:41 annoying.html.txt
-rw-r--r-- 2 slitt slitt 19096 Jun  4 19:18 less_annoying1column.html
-rw-r--r-- 2 slitt slitt 19096 Jun  4 19:18 less_annoying1column.html.txt
-rw-r--r-- 2 slitt slitt  6203 Jun  4 17:22 less_annoying.html
-rw-r--r-- 2 slitt slitt  6203 Jun  4 17:22 less_annoying.html.txt
-rw-r--r-- 1 slitt slitt   939 Jun  2 18:46 side_by_side.html
[slitt@mydesk web]$
Notice that the commands typed by me are violet, while the computer's responses are black. This is almost trivial to do, but it requires CSS and element classes, so it won't be discussed further on this web page.
Element Attributes vs Element Child Elements
Any HTML element can have attributes. An attribute is just a fact about the element. It's metadata.
Some HTML elements are allowed to contain other elements, known as child elements or children. Such child elements are not attributes. This is a possible source of confusion that should be put to rest.
From a meaning point of view, attributes are always about their element. Attributes cannot have their own attributes, nor can they contain child elements. From a meaning viewpoint, child elements can be, but probably are not, about their parent.
The following is an example of attributes:
<div class="boat_desc" id="rowboat">My boat.</div>
In the preceding, the class attribute of the <div></div> element has a value of "boat_desc", and the id attribute has a value of "rowboat".
The following illustrates child elements that are not descriptions of the parent element:
<div>
  <p>This is my first paragraph.</p>
  <img src="mypicture.png"/>
  <p>This is my second paragraph.</p>
</div>
In the preceding example, the first paragraph, the image, and the second paragraph are all children of the <div></div> element, in their order of appearance.
This brings up another point. Child elements have an order: one comes before another, which comes before a third. Attributes of an element have no order; a parser might give them in one order one time and another order a second time. If you absolutely must have them in order, you can use JavaScript to sort them, but that's waaay beyond the scope of this document.
Also, having child elements describe the parent is so hypothetical and contrived in HTML that I can't think of an example, and if there were such a configuration, it would need JavaScript to make it work as expected.
Note:
Unlike HTML, in XML it's occasionally helpful to have child elements describe a parent, but in general it's a bad idea in XML. XML has its own page in the Web Workmanship project.
Please be conscious of the fact that attributes are different from child elements, they're treated differently, and generally speaking, facts about an element are best expressed as attributes, while things that are inside an element must be expressed as child elements.
Attribute Syntax
An attribute describing an element must be positioned inside the opening tag (or only tag) for the element. First comes the attribute's name, then an equal sign, then the attribute's value within quotes. The following is an example:
<p id="park">Shortpar.</p>
In the preceding, the element is a paragraph which contains text "Shortpar". This paragraph element has an attribute called id. This attribute has a value of park, which is enclosed in double quotes. There is no space on either side of the equal sign.
Use the Quotes, Lose the Spaces
I know, I know, it's at this point someone will bring up the fact that the HTML specification allows spaces around the equal sign, and allows the attribute's value not to be quoted if it's a single word with no punctuation. OK fine, for sure for sure, but if you want a nice, easy life in HTML land, always quote the attribute value, and always remove spaces from around the equal sign.
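Here is a quick sketch of the two styles, reusing the paragraph from the Attribute Syntax example. These are alternatives, not two elements meant for the same page, since an id value should appear only once per page:

```html
<!-- Allowed by the specification, but harder to read and easier to get wrong. -->
<p id = park>Shortpar.</p>

<!-- Recommended: value quoted, no spaces around the equal sign. -->
<p id="park">Shortpar.</p>
```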
Which kind of quotes?
There are a lot of kinds of quotes these days. Single quotes, double quotes, back quotes, smart double quotes, smart single quotes, and probably several others. You want to surround your attribute value with one of these two kinds of quotes:
- ", which is ASCII 34 decimal, UTF-8 34 decimal, usually called double quote.
- ', which is ASCII 39 decimal, UTF-8 39 decimal, usually called single quote.
This is probably obvious, but use the same kind of quote on both ends of the attribute value.
Most people use double quotes unless the attribute value contains double quotes but no single quotes, in which case they surround the attribute value with single quotes. If you find yourself in a situation where you must use the same kind of quote inside the attribute value that you use to surround it, just use one of the following inside the value:
- &quot; for a double quote
- &apos; (or &#39;) for a single quote
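The two situations can be sketched like this. The title attribute and the boat text are only illustrations chosen for this example; the second paragraph uses &quot; to stand in for a double quote inside a double-quoted value:

```html
<!-- Value contains double quotes but no single quotes: surround it with single quotes. -->
<p title='The "big" boat'>My boat.</p>

<!-- Value contains both kinds: keep the outer double quotes and escape the inner ones. -->
<p title="Sam's &quot;big&quot; boat">My boat.</p>
```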
Take a Moment
This section about attributes and child elements has been long and perhaps difficult. Take a moment to relax and contemplate what you've read. If you need to go back and read part or all of this section again, there's no shame in that.
Common HTML Elements
This section introduces frequently used HTML elements. It's divided into two subsections, containers and non-containers. Keep in mind that an HTML container element is an element that can contain other elements, whether or not it actually does contain other elements.
You needn't memorize the info in this section. Just understand it and know it exists. I recommend you bookmark it.
You Don't Need To Memorize This Section
This section gives you information on some common elements. You needn't memorize this section at this time, because you can look back at it frequently. Just read this section well, make sure it makes sense to you, and you'll have gotten value from it.
Container elements
Structural container elements
- <html></html>: html element: The root element of a web page.
- <head></head>, a direct child of <html></html>: The head element, which contains metadata and CSS (discussed later in the Web Workmanship project). The <head></head> element must come before the <body></body> element, which is the only other direct child of the <html></html> element.
- <title></title>, a direct child of <head></head>: The title element, contains the text of the title.
- <style></style>, a direct child of <head></head>: A style element contains on-page CSS declarations. There can be more than one of these, and they should usually come after any <link/> elements.
- <script></script>, a direct child of <head></head>: The script element, which can be used to import JavaScript or contain on-page JavaScript. (It's also permitted inside <body></body>.)
- <body></body>, a direct child of <html></html>: The body element: This is where all the content goes. This element must come after the <head></head> element.
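The structural elements in the preceding list fit together roughly as in the following skeleton. It's a minimal sketch, not a complete page. The <!DOCTYPE html> line isn't an element and isn't covered in the list; I've included it as an assumption because browsers expect it at the top of an HTML5 file. The title and paragraph text are placeholders.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Page title goes here</title>
    <style>
      /* On-page CSS declarations go here. */
    </style>
    <script>
      // On-page JavaScript goes here.
    </script>
  </head>
  <body>
    <p>All the visible content goes here.</p>
  </body>
</html>
```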
Body-based container elements
All the following elements are direct or indirect children of the <body></body> element.
- <p></p>: A p element that holds a paragraph. Various other elements, <span></span> for example, can be interspersed within the paragraph text.
- <span></span>: span element: A part of a paragraph set apart for different appearance (such appearance is defined by CSS). <span></span> can contain text and other inline elements, such as <a></a> or another <span></span>.
- <a></a>: a element: This defines a hyperlink that sends you to a different page or a different part of the current page.
- <pre></pre>: pre element: Content inside this element keeps its indentation and line spacing.
- <video></video>: video element: This element puts a video on your web page.
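The following sketch shows these body-based containers in use. The class name, the file names boats.html and rowing.mp4, and all the text are hypothetical, made up purely for illustration. The href, src and controls attributes are standard HTML, but they aren't explained on this page.

```html
<p>
The <span class="boat_name">Rowboat</span> is described on
<a href="boats.html">the boats page</a>.
</p>

<pre>
Length:   12 feet
Weight:  150 pounds
</pre>

<video src="rowing.mp4" controls="controls"></video>
```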
Non-Container Elements
- <meta/>, a direct child of <head></head>: This element declares metadata about the web page. There can be more than one <meta/> element.
- <link/>, a direct child of <head></head>: This element is used to import a CSS stylesheet.
- <hr/>, a direct or indirect child of <body></body>: This element draws a horizontal line across the page. You can vary the length, width, color and style of the line with CSS; the older presentational attributes still work in most browsers but are obsolete in HTML5.
- <br/>, a direct or indirect child of <body></body>: Use this element to create a line break on the browser-rendered content, without starting a new paragraph.
- <img/>, a direct or indirect child of <body></body>: This element is used to insert a graphic into your web page.
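Here's a short sketch of the non-container elements in use, each closed with a forward slash. The stylesheet name mystyle.css and the text are hypothetical. The charset, rel, href and alt attributes are standard HTML that this page doesn't otherwise cover.

```html
<!-- These two belong inside <head></head>. -->
<meta charset="utf-8"/>
<link rel="stylesheet" href="mystyle.css"/>

<!-- These belong somewhere inside <body></body>. -->
<p>First line of the address<br/>Second line of the same paragraph</p>
<hr/>
<img src="mypicture.png" alt="A picture of my rowboat"/>
```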
A Simple Example
You can see a web rendering of a simple HTML file. You can also view it as source HTML.
Special Characters
You needn't memorize the info in this section. Just understand it and know it exists. I recommend you bookmark it.
There are three special characters in HTML that trigger parsing and therefore cannot be used as content in HTML:
- &
- <
- >
Initially this seems like a problem, because angle brackets are used constantly in math and computer programming, and the ampersand (and sign) is used constantly in company names such as Dewey, Cheatham & Howe. And don't even think of substituting the word "and"; such companies get very touchy about that.
Fortunately there are easy ways to write content so that the web browser renders them the way you want, even though in the HTML code they're written very differently. Consider the following:
- & can also be written as &amp; (ampersand) or &#38; within your HTML code.
- < can also be written as &lt; (less than) or &#60; within your HTML code.
- > can also be written as &gt; (greater than) or &#62; within your HTML code.
The named abbreviations are customarily preferred to the ASCII/UTF-8 number representations.
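In practice it looks something like the following sketch. The sentences are invented for the example. A browser should show a plain ampersand and plain angle brackets wherever the source has &amp;, &lt;, &gt; or their numeric forms:

```html
<p>Dewey, Cheatham &amp; Howe</p>
<p>If a &lt; b and b &lt; c, then c &gt; a.</p>
<p>The numeric forms work too: 3 &#60; 5 &#38; 5 &#62; 3.</p>
```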
Special Symbols
HTML has some special symbols to represent things not easily represented as text. This section has a list of some common ones. You don't need to memorize the list: Just know they exist so you can look them up and use them when needed. You might want to bookmark this section. The list follows:
- ©: Represented by &copy;: Copyright symbol.
- ®: Represented by &reg;: Registered trademark symbol.
- &: Represented by &amp;: Ampersand symbol.
- <: Represented by &lt;: Less than symbol.
- >: Represented by &gt;: Greater than symbol.
- ✓: Represented by &check;: A check mark.
- €: Represented by &euro;: Euro currency symbol.
- ¥: Represented by &yen;: Japanese Yen currency symbol.
- £: Represented by &pound;: British Pound currency symbol.
- $: Represented by &dollar;: Dollar currency symbol, handy when your keyboard lacks a dollar key.
- λ: Represented by &lambda;: Lambda symbol.
- “: Represented by &ldquo;: Slanted left double quote.
- ”: Represented by &rdquo;: Slanted right double quote.
- ‘: Represented by &lsquo;: Slanted left single quote.
- ’: Represented by &rsquo;: Slanted right single quote.
- …: Represented by &hellip;: Ellipsis.
- █: Represented by &block;: A block symbol.
- —: Represented by &mdash;: An em dash.
- –: Represented by &ndash;: An en dash.
- ‐: Represented by &dash;: A short dash.
- ☆: Represented by &star;: A little star.
- •: Represented by &bull;: A round bullet.
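A few of these symbols might be used as in the following sketch. The company name and the prices are made up for the example:

```html
<p>&copy; 2022 Example Widgets&reg;. All rights reserved&hellip;</p>
<p>Price: &euro;10, &pound;9 or &yen;1400.</p>
<p>&ldquo;Quoted text&rdquo; &bull; pages 1&ndash;5</p>
```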
Where to Go From Here
With this page you've received a great introduction to HTML. You've learned that HTML is made out of text and elements which are defined by tags. You've learned that unless inside a <pre></pre> element, line breaks render as spaces and multiple consecutive spaces render as one space. You've learned about container-capable elements and non-container-capable elements, and how the former should have an end tag, while the latter should have a forward slash before the closing angle bracket.
You've seen the difference between attributes and child elements, and mastered the attribute syntax. You've skimmed the common elements, divided into container and non-container categories, and know to come back to this page as a reference. You've seen a simple HTML file example both as HTML code and as rendered on your browser. You've learned about the three special characters that affect parsing, and how to write them in HTML if you need them to render literally. You've seen a list of many special symbols, which you can refer back to at any time. You now have an excellent grounding in HTML, although of course you'll have to refer back to this page until you've memorized the material.
The next step is to learn XML. You probably want to know why knowing XML is important for fast, efficient and correct HTML authoring. To find out, and become a much better HTML author, read the Web Workmanship project's XML primer.