Chapter Four

Putting Your Information On Line


Making information available via the Internet can occur in many ways. If you have a direct connection to the Internet (as opposed to connecting through an Internet service provider), you can establish publicly-accessible files which reside on your computer or network. For most users, however, creating a homepage on the World Wide Web and linking information to it is the easiest way to establish an Internet presence.

4.A. Homepage Creation

As discussed previously, a homepage is a World Wide Web document that acts something like a table of contents for other files which are linked to it. To build a homepage, you need several things. First, you need permission from your Internet provider and instructions relating to your specific site (for example, where to put the documents you create). Second, you need a guide to HTML and a word processor capable of creating ASCII or text documents. Third, and optionally, you may want an HTML authoring program that can help relieve you of the drudgery of typing in all those HTML commands.

4.A.1. What is HTML?

HTML is the acronym for "hypertext mark-up language". It is the language used to create the documents for the World Wide Web. Although most browsers will display any document that is written in plain text, there are advantages that you get by using HTML. When HTML documents are read by applications specifically designed for the Web (usually called browsers or Web browsers) they can include formatting, graphics, and even links to other documents. HTML is the encoding system which allows browser programs to display text and other features in a graphical format and to create links to other documents.

Like any other computer language, HTML uses a combination of keyboard letters and symbols to direct the translation program. In this case, the codes direct the browser program to do two basic things: 1) display text in a certain way on the screen, and 2) find and open other documents which are linked to the text through embedded codes.

HTML was conceived by the inventor of the World Wide Web, Tim Berners-Lee, and was further developed by Dan Connolly, Dave Raggett and a team of volunteers. This team now forms the HTML Working Group of the Internet Engineering Task Force (IETF). Early on it was recognized that users, developers, and authors needed a reference point for HTML so that there was agreement about the meaning and usage of the language.

Like any new application, HTML is still evolving. You may come across references to several versions:

4.A.2. How HTML Works

Because files have to be portable between systems, HTML uses only plain text: the characters A-Z, a-z, 0-9 and punctuation which can be used on all computer systems. If you use a word processor like WordPerfect, Word, or MacWrite, you must always be sure to save your document as "text only." If you use a text-only editor like TeachText, SimpleText, or EMACS, this will happen automatically.

HTML uses textual markup to identify the structure of your document. It does this by using keywords called tags to surround portions of your text, so that the computer can recognize the elements which make up your document. In other words, HTML documents begin as simple text, to which you add commands (or tags) at various places to tell the browser program what to do. Thus, you may create the entire document on the word processor you are comfortable with, add the proper commands (HTML tags), then copy the encoded text to the Internet following the requirements of your Internet service provider. When a user visits your homepage, none of the codes show; the browser program displays only the text and images... hopefully displayed as you intended.

4.A.3. HTML Tags

In general, HTML commands begin with a < and end with a >. The commands are almost never case sensitive and are either "container" or "separator" commands (although there are numerous exceptions to both of those generalizations). A container command means that there is a beginning command and an ending command. An example of a container command is the title command, which surrounds the text that is designated as the document’s title with <title> and </title>. An example of a separator command is the command used to separate paragraphs (<p>).

You should notice a few things about the way these element tags are used:

The names of element tags are predefined in HTML: you can’t make up your own. However, some tags perform similar tasks and older versions may be seen as HTML continues to evolve. (For example <br> is used to mark special line-breaks which may sometimes be seen in place of <p> the paragraph separator.) Element tags are ‘case insensitive’, so typing TITLE or Title means just the same thing.
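As a brief illustration of the points above, the following fragment is a minimal sketch showing a container command, a separator command, a line break, and tag names typed in different cases. The sample text and street address are invented for the example.

```html
<!-- Container command: a start-tag and an end-tag surround the text. -->
<title>Planning Department Notices</title>

<!-- Tag names are case insensitive: this line means exactly the same. -->
<TITLE>Planning Department Notices</TITLE>

<!-- Separator command: a single <p> marks the break between paragraphs. -->
The meeting begins at 7 p.m.
<p>
Agendas are available at the front desk.

<!-- <br> forces a line break without starting a new paragraph. -->
City Hall<br>
100 Main Street
```

The lines beginning with <!-- are HTML comments; browsers do not display them.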

4.A.4. Elements and Attributes

The word ‘tag’ is used to describe the individual start- and end-tags; the word ‘element’ refers to the whole element, including both start- and end-tags and the text between them.

Elements fall into one of three classes: structural, descriptive, and visual.

In HTML, most elements are structural; there are also many descriptive elements though fewer visual ones. The objective is to concentrate on content and meaning, rather than how it happens to look on any one user's machine.

Some start-tags can hold additional information. These items are called attributes, and they occur inside the angle brackets of the start-tag, after the tag name. They are used to make finer distinctions about the meaning or use of an element, or to hold important information which is not to be displayed as part of the text. For example:

<pre width="72">

The width attribute specifies that the preformatted text defined by the <pre> element should be displayed with a maximum line length of 72 characters per line.

Attributes are separated from the element name and from each other by a space. They usually take the form of keyword="value" pairs, as in the above example, although in some cases you can use the keyword on its own.
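The fragment below is a small sketch of the attribute forms just described. The href and src attributes are covered in the sections on links and images; the alt and compact attributes, the file name map.gif, and the sample text are assumptions chosen for illustration and do not come from the examples in this chapter.

```html
<!-- keyword="value" pair: href holds the address of the linked document. -->
<a href="http://www.ceres.ca.gov/index.html">CERES</a>

<!-- Two attributes, separated from the tag name and from each other
     by spaces. map.gif is a hypothetical file name. -->
<img src="map.gif" alt="Map of the Tahoe region">

<!-- A keyword used on its own: compact asks for a tighter list layout. -->
<dl compact>
<dt>GIF
<dd>A common image format.
</dl>
```

None of the attribute values appear on screen as part of the text; they only tell the browser how to treat the element.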

4.A.5. General Form

Every HTML document should follow this general form:


<html>
<head>
<title>Title of Page</title>
</head>
<body>
The text and HTML tags that define your page go here
</body>
</html>

(Remember that hard returns don’t matter, so this could all be on one line if you wanted.)

The entire document is enclosed in HTML tags: it begins with <html> and ends with </html>. Header information is given between the <head> and </head> tags. This should include the document title that will appear at the top of the browser window. The entries between the <body> and </body> tags will appear in the browser window. Many HTML tags come in pairs; be sure to close every paired tag and to nest the pairs properly.
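To make the idea of nesting concrete, here is a minimal sketch using two paired tags that are not otherwise discussed in this chapter: <b> (bold) and <i> (italic). The sample text is invented.

```html
<!-- Properly nested: the inner pair closes before the outer pair. -->
<b><i>Public hearing tonight</i></b>

<!-- Improperly nested: the pairs overlap, and browsers may
     display the text unpredictably. -->
<b><i>Public hearing tonight</b></i>
```

A useful habit is to close tags in the reverse of the order in which you opened them.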

4.A.6. The Absolute Essentials

If all you want to do is convert an existing document or file into a Web document, a few simple steps will give you an organized and readable result. This is a good way to make meeting minutes, announcements, or agendas available over the Internet.

That’s it, you’re finished.

These steps produce a very simple document which is easy to create and read. They do not really tap the potential of Web pages, however. If you want to include more sophisticated formatting, or include links to other documents or graphics, you will need to read on. For everyone who is used to having a lot of control over the look and feel of a page (such as publishers), a paradigm shift occurs with using HTML. The combination of the structure of the Web and the browser programs which display it actually gives the reader more control over the look of the document than the publisher. Pages crafted to look exactly as you want on one browser may display differently, or be inaccessible, on another.

4.B. The HTML Document

4.B.1 The Heading

Every HTML document should start by declaring itself as such. An HTML file should be self-documenting: that is, it should contain some information about itself, so that you can identify the file without having to read through it all. You do this with the header, in which you can specify the title of the file and a variety of other information about it.

A header with a title must occur in every file. Here is an example of a header with a title:


<head>
<title>Tahoe Region</title>
</head>

The <title> element is a kind of label for recording the function of the file: it is not a part of the document text. Most browsers show the file title at the top of the screen, separate from the text, either off to the right-hand side or in a separate panel.

4.B.2 The Body

All of the file after the header is enclosed in the <body> and </body> tags. This is where all the text, illustrations, forms, tables, and hypertext references go. The body of most documents consists of a mixture of elements: some are simple one-line items like section headings, subheadings, and illustrations; others are blocks of text like paragraphs and lists. A lot depends on the nature of the material and how you want to present it.

HTML is easiest to use with documents with a fairly rigid structure - ones with a definite outline of headings, subheadings, and lists. It is not required, but it is good practice to write your document so that the heading "levels" used reflect the organization of your document. For example the first heading should be a "level 1" heading, subheadings should be "level 2", and so on.

4.B.3 Heading Levels

Most browsers recognize at least four heading levels. There is support in HTML for more than that, but after the fourth level, it gets difficult to tell the heading levels apart. If you get much beyond that, you should consider breaking up your document into multiple pages.

The heading commands look like <hx> and </hx>, where ‘x’ is the heading level. In most documents on the Web, the first heading is a duplicate of the document’s title. A document with three different headings called "City Council Meeting Minutes" would have some variation of the following format:


<html>
<head>
<title>City Council Meeting Minutes</title>
</head>
<body>
<h1>City Council Meeting Minutes</h1>
<h2>February 2, 1996</h2>
<h3>City Council Chambers</h3>
The minutes to the City Council Meeting go here. More text to be added.
</body>
</html>

When viewed on a browser, the above document would look something like this:

City Council Meeting Minutes

February 2, 1996

City Council Chambers

The minutes to the City Council Meeting go here. More text to be added.

4.B.4 Paragraphs

Normal paragraphs are separated with the <p> command. This is an example of a separator command, although it can be used as a container command also. You can place the paragraph marks after each paragraph or they can just as easily go in front of each paragraph. The <p> is just a separator. In the above example, adding the paragraph command could look like this:


<html>
<head>
<title>City Council Meeting Minutes</title>
</head>
<body>
<h1>City Council Meeting Minutes</h1>
<h2>February 2, 1996</h2>
<h3>City Council Chambers</h3>
The minutes to the <p> City Council Meeting go here. <p>More text to be added.
</body>
</html>

In that case the browser would display the following:

City Council Meeting Minutes

February 2, 1996

City Council Chambers

The minutes to the

City Council Meeting go here.

More text to be added.
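The paragraph command can also be used as a container, as noted at the start of this section. The following is a minimal sketch of the same minutes page written that way, with each paragraph enclosed in <p> and </p>; the subheadings are left out only to keep the example short.

```html
<html>
<head>
<title>City Council Meeting Minutes</title>
</head>
<body>
<h1>City Council Meeting Minutes</h1>
<p>The minutes to the City Council Meeting go here.</p>
<p>More text to be added.</p>
</body>
</html>
```

Written this way, the start and end of every paragraph are explicit, which can make a long document easier to check by eye.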

4.B.5 Lists

There are three basic types of lists: unnumbered, numbered and definition. The lists may be nested as well. An unnumbered or numbered list begins with a tag that identifies the type of list, and then uses the <li> tag to denote the beginning of each list item. A definition list requires separate tags for each term and definition.

Ordered Lists.
In ordered lists the browsers take care of inserting the actual numbers. This behavior is convenient for authors because if you insert or delete items in an ordered list, you don’t have to worry about renumbering everything. An ordered list begins with <ol> and ends with </ol>. For example:

<ol>
<li>This is the first item in an ordered list.
<li>This is the second item.
<li>They can also be called numbered lists.
</ol>

Looks like this:

  1. This is the first item in an ordered list.
  2. This is the second item.
  3. They can also be called numbered lists.

Unordered Lists.
Unordered lists typically use bullets to mark off each item in the list, but this is up to the browser (a DOS browser may use asterisks or dashes, for example). An unordered list begins with <ul> and ends with </ul>. For example:

<ul>
<li>This is the first item in an unnumbered list.
<li>This is the next.
<li>The third item itself can be a nested list:
  <ul>
  <li>first item in the nested list
  <li>second item, be sure to end both lists!
  </ul>
</ul>

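Looks like this:

  * This is the first item in an unnumbered list.
  * This is the next.
  * The third item itself can be a nested list:
      * first item in the nested list
      * second item, be sure to end both lists!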

Definition Lists.
A definition list is a very flexible type of list that is more useful than its name implies. It’s useful for lists where a bit of explanatory text should accompany each item. Each item in the list has two parts, a term (indicated with the <dt> command) and a definition (which uses the <dd> command). The list itself is started with a <dl> command and closed with a </dl> command. Here’s a sample of a definition list used to explain some of the hierarchy of a government department:

<dl>
<dt>Department Chair
<dd>An appointed position with authority over all department functions who serves at the pleasure of the Council.
<dt>Team Leader
<dd>A supervisory staff position with responsibility for program development and implementation.
</dl>

Looks like this:

Department Chair
An appointed position with authority over all department functions who serves at the pleasure of the Council.
Team Leader
A supervisory staff position with responsibility for program development and implementation.

4.B.6 Links

Links are what make Web pages unique. Possibly the best part of the web is the ease with which you can go from one document to another, or load movies, pictures, sounds, and programs. These are accomplished through links. A link gives the location of a file, and the method your computer needs to retrieve it. The most complicated part of linking is the URL that points to the resource you are linking to. (See Chapter 1 for a description of a URL.) The basic tag is the <a> for anchor. A link to another web page has this structure:

<a href="http://www.ceres.ca.gov/index.html">CERES</a>  

The browser highlights the word CERES on the screen. When the user clicks on the highlighted word, the browser jumps to the link, which in this case is the URL for the California Environmental Resource Evaluation System homepage. The browser then opens the file called "index.html" and displays it on the screen.

There are two types of links: absolute and relative. Absolute links give the full URL of the other file. The example above is an absolute link. Relative links give the location of the other file in relation to the current document displayed on the browser.

Absolute links give the full URL of the other file. Web browsers use URLs as a standard format for the information required to access other files. Each URL contains the method used to retrieve the file (e.g. http for a web page and ftp for a transferable file) and the file's location. If you provide a link to a file that will be downloaded or displayed by a "helper" program (e.g. an ftp program) rather than the browser itself, a common courtesy is to identify the size and type of file in your HTML document. This allows users to estimate how long the file will take to load before they get stuck waiting.
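As a sketch of the courtesy just described, a link to a downloadable file might state the file type and size in the text beside it. The ftp address, file name, and size below are hypothetical and stand in for your own.

```html
<!-- The text after the link tells the reader what to expect
     before clicking. The address and size are hypothetical. -->
<a href="ftp://ftp.example.org/pub/maps/landuse.zip">Land use map data</a>
(ZIP archive, about 1.2 MB)
```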

Use absolute links when the document to be called is not resident on your server. (Note: if the URL for the linked document changes, the absolute reference must change also.)

Relative links give the location of another file in relation to the current one when both are located within the same host directory. Relative links are nice because they are portable. You can switch servers and be able to set things up in the new location with no hassles; simply copy all of the files into a directory on the new machine, and organize them in the same way they were set up on the old machine. This comes in handy when you develop a set of connected pages on your own computer and then move them to your server when they are all set up. Relative links are also shorter and therefore easier to enter (the tag is shorter and you do not need to include the entire URL, just the filename). Use relative links to connect your own documents and call your own images.
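The difference between the two kinds of link can be sketched as follows. The CERES address is the one used earlier in this section; the file and directory names in the relative links (minutes.html, agendas/february.html, seal.gif) are hypothetical.

```html
<!-- Absolute link: the full URL, for a document on another server. -->
<a href="http://www.ceres.ca.gov/index.html">CERES</a>

<!-- Relative link: a file in the same directory as the current document. -->
<a href="minutes.html">Meeting Minutes</a>

<!-- Relative link: a file in a subdirectory of the current directory. -->
<a href="agendas/february.html">February Agenda</a>

<!-- Relative reference to an image stored alongside the document. -->
<img src="seal.gif">
```

If the whole set of files is copied to a new server with the same directory layout, the three relative references keep working unchanged; only the absolute link depends on an address outside your control.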

4.B.7 Images

To add an image to your document, you must first convert it into a digital format of which GIF is the most common. To make things easier on yourself, save the images that you want to show in your document into the same directory as your document (it is also possible to display a GIF image that is stored almost anywhere). The HTML command for inserting an image at the current position takes the following form:

<img src="name_of_image.gif">

In the example, "name_of_image.gif" is the name of the file which contains the graphic. If the image is not saved in the same directory as the document that displays it, give the URL where the image is located instead.

You can even make a graphic a hyperlink by combining tags. (Note: the following example uses local file references or relative links, in which case the complete URL is not necessary.)

<a href="homepage.html"><img src="image.gif"></a>
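To show how the pieces described in section 4.B fit together, here is a minimal sketch of a complete page that uses a title, headings, a paragraph, an image, an unordered list, and both relative and absolute links. The file names (lake.gif, minutes.html, agenda.html) and the sample text are hypothetical.

```html
<html>
<head>
<title>Tahoe Region</title>
</head>
<body>
<h1>Tahoe Region</h1>
<img src="lake.gif">
<p>
This page collects planning documents for the region.
<h2>Documents</h2>
<ul>
<li><a href="minutes.html">Meeting minutes</a>
<li><a href="agenda.html">Next agenda</a>
</ul>
<h2>Related Sites</h2>
<p>
<a href="http://www.ceres.ca.gov/index.html">CERES</a>
</body>
</html>
```

Saving this text as a plain-text file and opening it in a browser is a quick way to see how each tag affects the display.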

4.B.8 HTML Editors

There are various software tools which can simplify your task of writing consistent and readable web pages. These tools are stand-alone HTML editors (separate applications that allow you to mark up new or existing text files). A flood of good programs are now available. Examples include:

Some tools can be integrated into word processing applications you’re already familiar with: these are style sheets. Most require that you also use a Web viewer to preview your documents.

Whether you’re creating a Web site for a major corporation or just putting up a few personal pages, you need an HTML editor or translator that suits the way you work. Most of the programs listed in this section are very new and, as new software tends to be, are not yet very standardized. It is a good idea to browse the Web and download freeware versions of editors available so that you can find the one right for you. (We found a program called HTML Assistant Pro at ftp://ftp.cs.dal.ca/htmlasst/htmlasst.zip that works well for our purposes.) Also, Special Edition Using HTML from the Que Corporation contains a CD with freeware, shareware and public-domain software.

4.C. Managing Internet Capabilities

This guide identifies many applications and tools to get the most out of Internet access. But each organization must approach new Internet capability with a clear sense of how to use it most effectively. This is particularly true for government agencies with tight staff situations. A supervisor in one local government recently warned, "I don't have the staff time to answer the public requests I get now; how will I handle one more avenue for contact?"

4.C.1. Answering Data Requests.

The Internet can make answering data requests much easier than in the past. Making oft-requested information available on line reduces the support staff time spent answering each request and mailing the information or document. This is particularly true for information such as GIS maps, which currently require copying the data onto magnetic tapes and mailing them. Consider the following tips to ensure that the best advantage is made of this technology.

4.C.2. Monitoring Use of the Internet (Hits).

Every time someone accesses your homepage, that information is recorded by the Internet provider. Each time a visitor opens a file (for example, by following a link), that is called a "hit". You can get a report from the server that tracks use of your homepage and tells you something about the computers accessing it. This allows you to amend the homepage based on how people are actually using it. For example, Sierra Net provides the following information in an effectiveness report:

4.C.3. Making Money on the Internet.

The most prevalent way to make money on the Internet is to promote a product which then is purchased at a store or is ordered on line. Most businesses find that a presence on the Internet increases sales of products which used to be advertised in other ways. Even though all sorts of information can be placed on the Web, a survey done by the Gartner Group indicated that customers want certain types of information from websites: software updates and fixes, customer service, and specific, factual product information. (Greenberg, Ilan. "Most Web sites fall short of user expectations, says study," InfoWorld, November 27, 1995)

In addition to on line advertising and on line ordering, many companies make money by providing Internet services such as Internet access (most Internet service providers are for-profit businesses), creating homepages, locating information, and troubleshooting technical problems.

4.D. Keeping Hackers Out of Your Computer

The computer hacker is fast becoming a standard character in thriller books and movies. This character is young with a glazed look in his eye (in thriller books and movies it is usually a "he") and an evil intent for the destruction of the world. A more salient concern for most organizations considering using the Internet is the security of proprietary data or files of any type. The goal of security systems is to grant access to your computer system only to those you intend. Chapter 2 briefly discusses use of passwords and firewalls. They are presented in more detail here.

4.D.1. Passwords.

Passwords are used throughout the Internet to limit access to particular sites. Many Internet servers require each user to register before accessing files or data; usually the password in these situations is assigned by the server. Some Internet providers allow the user to choose a password. As with any secret identifier, such as the PIN for a bank card, take care to choose a word or number which is hard to guess. Specifically, do not use words or numbers which are easy to relate to you (e.g. your birthday, initials, name of your street, etc.). It is a good idea to choose number/letter combinations if possible.

Whether you choose the password or it is assigned, be careful to keep it secret. The security of the system depends on the reliability of its users. Specifically:

Because most user password systems can be corrupted by careless users, this security system is not effective for guarding sensitive data. If you control data which is vulnerable to tampering, choose another security system for its protection.

4.D.2. Firewalls.

A firewall is the most secure system for organizations with direct Internet links. As described in Chapter 2, a firewall is a program which severely limits access into and out of specific networks. The firewall is placed between the internal network and its connection to the Internet (the external network). If you are working on the internal network, most firewalls require that you log into the firewall first, prove you are who you say you are with a password, then exit to the Internet. Usually, this system allows you to then fully utilize Internet capabilities. If you are on the external network, the firewall prohibits access to most internal systems, generally allowing only email. Dedicated hackers have been known to use email technology to crack a firewall, yet newer versions of available programs are considered "secure".

4.D.3. External Servers.

Firewalls work well when no access to a site's data is desirable. But what if you want users to be able to gather some information, but not all information? The most secure way to ensure this type of access is to establish an external server. In this arrangement, the organization with the sensitive system duplicates files on a predetermined schedule and sends them to a remote computer. Access to the server is generally available to all for file transfer. This system is used in some government organizations where public access to certain files (e.g. development applications) is necessary, but concerns about unapproved file manipulation exist. The public has access to public information in the file at the server, but not the official copy.

4.E. Java

Suddenly, which is the way things happen on the Internet, Java is a common word. Most casual web users will never work with this programming language, but they will benefit from the creative freedom it gives web designers. Java is an object-oriented language similar to C++. Though developed earlier by Sun for other purposes, it is finding widespread acceptance on the Web as a way to design compelling applications (known as applets) that are transportable to different platforms. When a Java-enhanced Web browser encounters an HTML tag that refers to an applet, it downloads the corresponding compiled code and runs the applet on the local client machine. At this early stage of development, this enables web pages to come alive with animation, audio and other simple effects. However, it also represents a step towards the vision of the Web PC, in which all computers would download and run the same application code, regardless of the operating system used, rather than store it on a local hard drive. Though its developers consider Java safe, security is still a major issue.
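For readers curious about what such a tag looks like, the sketch below assumes the <applet> tag used by Java-enabled browsers. The file name Hello.class and the dimensions are hypothetical, and the applet itself would have to be written and compiled separately.

```html
<!-- code names the compiled applet file; width and height reserve
     space on the page. Hello.class is a hypothetical file name. -->
<applet code="Hello.class" width="200" height="100">
Your browser does not support Java applets.
</applet>
```

The text between the start- and end-tags is shown only by browsers that cannot run the applet, so it is a good place for a short explanation.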

Internet Resource Guide



Last Update May 8, 1997