HTML tutorial

Basic HTML syntax

For the purposes of illustration, this page contains some invalid HTML tags, like <foo></foo>, and invalid LaTeX markup. For a list of all the valid HTML5 tags, consult the official specification.

If you are already familiar with LaTeX, you have probably encountered a few ways of grouping text for specific formatting.

\begin{foo}
  This text is in an environment. It always has a "begin" command and a
  matching "end" command.
\end{foo}

\bar{This text is just wrapped in a command. This block ends at the
closing curly brace.}

\section{Baz is the best}
The section command implicitly ends at the start of the next section,
or perhaps the end of the document. It doesn't have any explicit
marking that closes it.

HTML tends to be much more explicit, and the pattern for opening and closing a block is almost always the same.

<foo>This text is the content of this "foo" tag. It gets formatted according
     to the styling of the "foo" element but may also have class- or
     instance-specific styling.</foo>
<bar>You can nest practically anything.
  <baz>Like this! This text is formatted according to global styling
       for the "baz" element, then styling for any classes, then
       any inline styles. Since it is nested in a "bar," the stylesheet
       can also specify styles for all "baz" elements inside "bar"
       elements.</baz>
  Back in the "bar" element but outside the "baz" element.
  Remember to close your tags! HTML does not enforce any particular
  whitespace conventions, but indenting can help you see how elements
  are nested.</bar>

If the initial tag is <qux>, then the closing tag will be </qux>.
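
The same pattern with valid elements might look like the sketch below. The choice of em and blockquote, the page title, and the colour are assumptions made for illustration; they stand in for the made-up "bar" and "baz" tags above.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Nesting example</title>
  <style>
    /* Global styling: applies to every em element on the page. */
    em { font-style: italic; }
    /* Nested styling: applies only to em elements inside a blockquote. */
    blockquote em { color: darkred; }
  </style>
</head>
<body>
  <p>This paragraph contains <em>emphasized text</em>.</p>
  <blockquote>
    <p>Inside the quotation, <em>this emphasis</em> also picks up the
       nested rule.</p>
  </blockquote>
</body>
</html>
```

Every start tag here has a matching end tag, and the indentation mirrors the nesting. The one exception is meta, which cannot have contents and so has no end tag.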

The / character is a forward slash. It is not a backslash. Every time you incorrectly call it a backslash, a programmer somewhere on Earth silently weeps for humanity.

Many start tags take additional options, called attributes, as key-value pairs inside the angle brackets; these are never repeated in the end tag. Not every attribute is valid on every HTML element, but the standard specifies which attributes and values are valid for each one.
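
As a small sketch of what these key-value pairs look like in practice, consider the two elements below. The URL, the class name, and the id value are placeholders invented for this example.

```html
<!-- href and title are options on the a (anchor) start tag.
     The end tag repeats neither of them. -->
<p>See the <a href="https://example.org/paper.pdf" title="Preprint (PDF)">preprint</a>
   for details.</p>

<!-- class and id are accepted on essentially every element. -->
<p class="abstract" id="summary">A short summary goes here.</p>
```

The value is conventionally wrapped in double quotes, and multiple pairs are separated by whitespace.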

There are a few cases in HTML where a closing tag cannot be given. In those cases, the element by definition cannot have any contents, so an explicit closing tag is unnecessary. These tags denote void elements and include:

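<area />, <base />, <br />, <col />, <embed />, <hr />, <img />, <input />,
<link />, <meta />, <source />, <track />, <wbr />

Older versions of this list also include <command />, <keygen />, and <param />, which are obsolete or deprecated in the current HTML standard. The slash before the closing angle bracket is optional in HTML, so <br> and <br /> are equivalent.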

None of these tags are likely to be relevant in everyday use by academics, with the possible exceptions of <br />, which indicates a manual line break, and <img />, which embeds static images.
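
A brief sketch of both tags in use follows. The address, the file name figure1.png, and the alt text are placeholders invented for this example.

```html
<!-- br: a line break inside a single block of text, such as a postal address. -->
<p>Department of Examples<br />
   123 Placeholder Road<br />
   Sampletown</p>

<!-- img: src names the image file; alt is the text used when the image
     cannot be displayed or is read aloud by a screen reader. -->
<img src="figure1.png" alt="Scatter plot of the measured values" />
```

Neither tag wraps any content, which is why neither has a closing counterpart.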

A <p></p> element is a better option than <br /> if the line break simply indicates a new paragraph.
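
To make the difference concrete, here is a minimal comparison; the sentences are filler text written for this example.

```html
<!-- Preferred: each paragraph is its own p element. -->
<p>The first paragraph introduces the topic.</p>
<p>The second paragraph continues the argument.</p>

<!-- Discouraged for paragraphs: one block of text split by line breaks. -->
<p>The first paragraph introduces the topic.<br /><br />
   The second paragraph continues the argument.</p>
```

Both versions can look similar in a browser, but only the first tells the browser and the stylesheet where one paragraph ends and the next begins, so paragraph spacing and indentation can be styled consistently.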

The angle brackets < and > are special punctuation in HTML and should not be used in regular text; the parser will try to interpret whatever follows a < as if it were a tag. HTML has a special syntax, called character references (or entities), for entering those characters as well as many others in a way that does not confuse the parser.

<p class="joke">Why is 6 afraid of 7? Because 7 &lt; 8 &lt; 9</p>

Some common characters you might use include &lt; (less than), &gt; (greater than), &ndash; (en dash, like LaTeX's --), &mdash; (em dash, like LaTeX's ---), &ldquo; (left double quote, like LaTeX's ``), and &rdquo; (right double quote, like LaTeX's ''). Finally, since the ampersand itself is a special character, you enter an ampersand in text with a similar code, &amp;.
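
The codes listed above can be combined freely within ordinary text. The sentences in the following sketch are invented for illustration.

```html
<!-- Less than, greater than, ampersand, and an en dash in a page range. -->
<p>The results hold for 0 &lt; x &lt; 1 &amp; y &gt; 2 (pages 10&ndash;12).</p>

<!-- Typographic quotation marks and an em dash. -->
<p>She called it &ldquo;remarkable&rdquo; &mdash; and it was.</p>
```

A browser would display the first paragraph as: The results hold for 0 < x < 1 & y > 2 (pages 10–12). Each code starts with an ampersand and ends with a semicolon, and the semicolon should not be left out.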