
As described previously, all electronic communications based on TCP/IP protocols operate on the Internet. This includes a truly amazing amount of data, only a tiny part of which you want at any given moment. Before you can effectively exploit any of the services offered by the Internet you must be aware of both the existence of the service and the host or hosts on which it is available. Adequately addressing this "resource discovery problem" is a central challenge for both service providers and users wishing to capitalize on the possibilities of the Internet. The most common types of tools to help you navigate around the Internet are: Archie, Gopher, WAIS (pronounced "ways") and the World Wide Web. Simple descriptions of these programs are:
The archie system was created by a group at McGill University in Montreal to automatically track anonymous FTP archive sites, and this is still its primary function. The system currently makes available the names and locations of some 1,500,000 files at some 900 archive sites. Archie's User Access component allows you to search the "files" database for these filenames. When matches are found, you are presented with the appropriate archive site name, IP address, the location within the archive, and other useful information. You can also use archie to "browse" through a site's complete listing in search of information of interest, or obtain a complete list of the archive sites known to that server. The archie server also offers a "package descriptions" (or "whatis") database. This is a collection of names and descriptions gathered from a variety of sources and can be used to identify files located throughout the Internet, as well as other useful information. Files identified in the whatis database can then be found by searching the files database as described above. You need to know the filename or a close approximation of it to use archie and, due to its popularity, many archie servers are slow to use.
Users with direct Internet connectivity can try out an interactive archie server using the basic "telnet" command (available at most sites). To use, telnet to the host "archie.mcgill.ca" and login as user "archie" (there is no password needed). A banner message giving latest developments and information on the archie project will be displayed and then the command prompt will appear. First-time users should try the "help" command to get started.
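The filename lookup that archie performs can be illustrated with a small sketch. This is not archie's actual implementation: the site names, paths, and the search_files function below are hypothetical, chosen only to show how a substring query against a listing of archive contents returns site and location pairs.

```python
# Toy model of an archie-style "files" database.
# All hosts and paths below are made up for illustration.
FILES_DB = [
    ("ftp.example.edu", "/pub/net/gopher-client.tar.Z"),
    ("ftp.example.edu", "/pub/docs/internet-guide.txt"),
    ("archive.example.org", "/mirrors/unix/gopher-server.tar.Z"),
    ("archive.example.org", "/mirrors/mac/macweb.sea.hqx"),
]


def search_files(pattern, database=FILES_DB):
    """Return (site, path) pairs whose filename contains the pattern.

    Matching is a case-insensitive substring test on the last
    component of the path, so an approximate name is enough.
    """
    needle = pattern.lower()
    matches = []
    for site, path in database:
        filename = path.rsplit("/", 1)[-1]
        if needle in filename.lower():
            matches.append((site, path))
    return matches
```

With this sample data, search_files("gopher") returns the two gopher archives and skips the other files, which mirrors the way you need only a close approximation of the filename.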
Gopher is a menuing system designed to access different sites on the Internet. The name originates both from a pun on the phrase "go fer" and in reference to the industrious, hardworking mammal. Gopher can be thought of as similar to the directories on your computer; they can contain files or other directories. It is menu driven, which allows each choice to follow logically from the one before. The first version of gopher was written for UNIX systems, but many versions for all computer platforms are now available. You may have a gopher program on your computer, but you may also access one remotely by telnetting to a gopher server. Because nearly all gopher servers have references to each other, you may pick one close by and not have to worry about missing something.
To begin gopher, you either click an icon or type the word "gopher" if you have a resident program, or you telnet to a gopher server and log on as "gopher". This displays the first menu, which offers choices such as: information about gopher, discussion groups, fun and games, library, news, and search gopher titles at the University of Minnesota. These items are either other menus or search items with links to other gopher servers. As you move through the menus, you end up with lists of files containing information you want to access. You then choose a file, look at it, and tell gopher what you want to do with it: email it to you, copy it to you (using FTP), or print it. (If you telnetted to get to the file, you may only choose email.) Although gopher and archie originated with very different goals, they have evolved to be complementary programs. Through use of a gateway, you may use gopher to retrieve files found by archie.
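The directory-like structure of gopher menus can be sketched as nested data. The following is only an illustration, not the gopher protocol: the menu titles echo the examples above, while the contents and the list_menu and follow functions are invented for this sketch.

```python
# Toy gopher-style menu tree. A dict is a menu; a string is a document.
# The contents are invented placeholders.
ROOT_MENU = {
    "Information about gopher": "Gopher is a menu-driven system.",
    "Library": {
        "Catalog": "Placeholder catalog text.",
        "Hours": "Placeholder hours text.",
    },
    "News": {
        "Campus": {"Today": "Placeholder news item."},
    },
}


def list_menu(menu):
    """Return numbered menu lines, marking submenus with a trailing slash."""
    lines = []
    for number, (title, item) in enumerate(menu.items(), start=1):
        suffix = "/" if isinstance(item, dict) else ""
        lines.append(f"{number}. {title}{suffix}")
    return lines


def follow(menu, choices):
    """Follow a sequence of 1-based menu choices and return what is reached."""
    current = menu
    for choice in choices:
        if not isinstance(current, dict):
            raise ValueError("cannot descend into a document")
        titles = list(current)
        current = current[titles[choice - 1]]
    return current
```

Here follow(ROOT_MENU, [2, 2]) picks "Library" and then "Hours", ending at a document, just as each gopher choice follows from the menu before it.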
WAIS (pronounced "ways") stands for Wide Area Information Servers and is designed to retrieve information from networks. It originated as a project to find and retrieve databases by experts at Thinking Machines, Apple Computer, and Dow Jones. Unlike other retrieval tools, WAIS looks at the contents of documents rather than just the titles. WAIS versions exist for all computer platforms.
WAIS organizes its data based on keywords; from keyword choices, it lists the servers which contain relevant information. You may access short descriptions about the server to confirm whether you want to search that database, then return to the server list screen and select each for the search. WAIS then displays another list of choices which may contain the information you are looking for. When you find something interesting, selecting it prompts WAIS to retrieve it and open it for viewing. You may then email it to yourself if you want to save it. While this system is unique in its ability to search for contents, it is time consuming to use the search features. WAIS can be used in conjunction with both archie and the World Wide Web to improve its functionality.
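The difference between searching titles and searching contents can be shown with a short sketch. It uses simple word counting to rank results, which is an assumption made for illustration and not a description of how WAIS actually scores documents; the sample documents and function names are invented.

```python
import re
from collections import Counter

# Invented sample documents: title -> body text.
DOCUMENTS = {
    "Campus network guide": "How to reach the gopher server and use telnet.",
    "Library hours": "The library is open daily. Telnet access to the catalog is available.",
    "Telnet basics": "Telnet lets you log in to a remote host.",
}


def words(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_search(keyword, documents=DOCUMENTS):
    """Return titles that contain the keyword as a whole word."""
    key = keyword.lower()
    return [title for title in documents if key in words(title)]


def content_search(query, documents=DOCUMENTS):
    """Rank documents by how often the query words occur in their bodies."""
    terms = words(query)
    scored = []
    for title, body in documents.items():
        counts = Counter(words(body))
        score = sum(counts[term] for term in terms)
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]
```

A title search for "telnet" finds only "Telnet basics", while a content search for the same word finds all three documents, which is the advantage of looking inside the documents rather than at titles alone.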
The World Wide Web greatly improves the ability to organize and retrieve information on the Internet by acting more like a human in its search process. When we mortals are engaged in a research project, we initially consult one reference, then use its bibliography to go directly to other sources of information. Using hypertext links, the World Wide Web behaves the same way. Keywords in one document are highlighted; when they are chosen, the WWW activates a link which sends the user directly to another file. No menus or command choices are necessary. Of course, these links operate only as well as the creators made them, but the opportunities for dynamic information retrieval are endless.
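To make the idea of hypertext links concrete, the sketch below pulls the highlighted text and its target out of a small HTML fragment using Python's standard html.parser module. The sample page and the file name gopher.html are invented; the LinkCollector class and extract_links function are names chosen for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, target) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # target of the anchor currently open, if any
        self._text = []     # text pieces seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    """Return every (text, target) link found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


SAMPLE_PAGE = (
    "<p>Read about <a href='gopher.html'>Gopher</a> and "
    "<a href='http://www.yahoo.com/'>Yahoo</a>.</p>"
)
```

Each pair returned by extract_links(SAMPLE_PAGE) corresponds to one highlighted keyword and the file a browser would fetch when it is chosen.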
This review of resource discovery tools presents four programs which seem to do about the same thing. John R. Levine and Carol Baroudi, in Internet for Dummies, provide some invaluable rules of thumb for when to use which tool. They are (Levine and Baroudi, Internet for Dummies, IDG Books, 1994, p. 283):
A browser is a program that lets you access and display information on the World Wide Web. There are many browser programs; the most popular are described below.
Mosaic is the most widely used browser. It is a graphical WWW browser, originally developed at the University of Illinois, which runs on Windows, Macintosh, and UNIX (under X Windows) systems. Mosaic displays images and plays sounds with the help of local utilities. Navigation within the web is intuitive and additional features (mailing feedback, customizing, etc.) are easy to use. The links to other documents can be activated by clicking on highlighted words or choosing graphical icons. Mosaic also provides an interface to the other information systems (WAIS, Gopher, etc.), thus giving access to all Internet resources from a single interface. Its primary drawback is that it is slow. Mosaic designers (and there are now several, as the initial version has been licensed to many other programmers) recommend a minimum 486/33 computer with 8 megabytes of RAM, but it still takes some time to load the multimedia pages. This can be overcome by turning off the graphics and viewing only the text information.
A good alternative for users without a graphical environment is Lynx. Lynx is an ASCII terminal (UNIX) browser for WWW using arrows and tab keys, cursor addressing and highlighted or numbered links to navigate within the web. Lynx has no image or sound capabilities: any images or sounds are replaced by a tag at display time and the corresponding files can be retrieved separately. Unlike the line mode browser, documents containing images or enhanced document formats are handled properly by Lynx. Lynx can be used to access information on the World Wide Web, or to build information systems intended primarily for local access. For example, Lynx has been used to build several Campus Wide Information Systems (CWIS). In addition, Lynx can be used to build systems isolated within a single LAN. A demonstration version of Lynx is available using Telnet to: www.cc.ukans.edu (login as www).
Netscape is a WWW browser for MS Windows, Macintosh, and X that has gained popularity. Netscape Communications was founded in 1994 by Dr. James Clark, founder of Silicon Graphics. The browser itself, however, was created by Marc Andreessen (an original creator of Mosaic) and was originally called "Mozilla." That is why there are so many pictures of that cute little dragon on their web pages.
Netscape Navigator uses principles of point-and-click network navigation and is designed to run well over modem connections. Netscape Communications offers a full line of open software to enable electronic commerce and secure information exchange on the Internet and private TCP/IP based networks. The software line includes three families of products: Netscape Navigator client software, Netscape Server software, and Netscape Internet Applications. The products deliver secure communications, advanced performance and point-and-click simplicity to companies and individuals who want to create or access information services on the Internet or private TCP/IP networks. Netscape software products offer easy-to-use interfaces for serving and accessing multimedia information on the net, including formatted text, graphics, audio and video. The products are based on industry standard protocols and are fully compatible with other HTTP-based clients and servers.
Samba is a suite of programs which work together to allow clients to access UNIX file space and printers via the SMB (Server Message Block) protocol. In practice, this means that you can redirect disks and printers to UNIX disks and printers from LAN Manager clients, Windows for Workgroups 3.11 clients, Windows NT clients and OS/2 clients. There is also a UNIX client program supplied as part of the suite which allows UNIX users to use an FTP-like interface to access file space and printers on any other SMB servers. This system does not support Macintosh networks.
Other browsers include MacWeb (ftp://ftp.einet.net/einet/mac/macweb/macweb.latest.sea.hqx), TCP/Connect II (http://www.intercon.com/download.html) and Enhanced Mosaic (http://www.vmedia.com/ or http://www.digital.com/), plus many more.
Search engines are the programs which search through information on the World Wide Web. They work in tandem with browsers; a browser program often opens a search engine and uses its commands to find and list references in response to search parameters. For example, CompuServe users can download a version of the browser program Mosaic. Mosaic will locate resources on the World Wide Web based on their addresses, but what if you don't know exactly where to look for something? You can then choose the "search" command and Mosaic starts Yahoo, a popular search engine. When you choose general topics from a list or type in keywords, Yahoo begins to search for matching information. The search results in a list of possible sites from which you can choose.
Finding information requires that you think about what you are looking for beforehand and that you understand how to make your query:
Search engines know about a lot of documents, so it helps to precisely specify your search. Often, though, you can be too precise, so finding what you want may take a couple of tries. Here are some suggestions about what to do when you don't get what you want, and how to phrase your search in a way that will produce the best results. What to do when...
Like browsers, many search engines exist to help find what you are looking for. The following general descriptions identify the most popular programs.
Yahoo (it stands for "Yet Another Hierarchically Organized Oracle") was developed by David Filo and Jerry Yang at Stanford University and is available at http://www.yahoo.com/. It provides a table-of-contents type of interface to Web (no Gopher) services, along with a more specific search tool. One of your choices in a menu bar across the bottom of the screen will be "Search." Select it, and you'll get a small form with a list of links to other searchable databases of Web resources. Yahoo!'s search engine enables users to find information about topics of interest through simple keyword queries. Searches can be restricted to titles, Uniform Resource Locator (URL) addresses or comments. Boolean searches can also be performed. Search results are returned along with their locations within Yahoo!'s hierarchical index. Although Yahoo will let you find all sorts of resources, it really shines in the area of online businesses and the services they offer.
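A keyword query that can be restricted to titles, URLs, or comments and combined with Boolean AND or OR can be sketched as follows. This is a hedged illustration of the general technique, not Yahoo's actual search code: the directory entries, the example.com addresses, and the search function are all invented.

```python
# Invented directory entries with the three searchable fields named above.
ENTRIES = [
    {"title": "Flower Shop Online", "url": "http://flowers.example.com/",
     "comments": "Order bouquets for delivery."},
    {"title": "Garden Supplies", "url": "http://garden.example.com/",
     "comments": "Seeds, tools, and flower bulbs."},
    {"title": "Bookstore", "url": "http://books.example.com/",
     "comments": "New and used books."},
]


def search(entries, keywords, fields=("title", "url", "comments"), mode="and"):
    """Return titles of entries matching the keywords in the chosen fields.

    mode="and" requires every keyword to appear; mode="or" requires
    at least one. Matching is a case-insensitive substring test.
    """
    if mode not in ("and", "or"):
        raise ValueError("mode must be 'and' or 'or'")
    combine = all if mode == "and" else any
    results = []
    for entry in entries:
        text = " ".join(entry[field] for field in fields).lower()
        if combine(keyword.lower() in text for keyword in keywords):
            results.append(entry["title"])
    return results
```

Searching for "flower" across all fields matches two entries, while the same search restricted to fields=("title",) matches only "Flower Shop Online"; narrowing the fields or switching from "or" to "and" is how a query is made more precise.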
The WebCrawler is composed of three essential pieces:
Infoseek Corporation was founded in January 1994 by Steven Kirsch, founder of Frame Technology and Mouse Systems, and an experienced team of professionals from the high tech industry. The Infoseek team envisioned an innovative way to meet people's information needs using the Internet.
Infoseek's World Wide Web index, used in Infoseek Net Search, Infoseek Guide and Infoseek Professional, takes advantage of the dynamic nature of the Web. Infoseek also works closely with corporate librarians and heavy users of information to identify the most pressing types of information necessary for success in their careers. Infoseek follows through to make that information available online through the special content collections of Infoseek Professional.
Infoseek technology provides "pinpoint search precision" while being easy-to-use. This technology underlies all of Infoseek's products. Infoseek Guide is the first Internet navigation tool that fully integrates this type of search precision with basic browse capabilities to help users navigate through the Internet.
When you find something interesting on the Internet, the next step is retrieving it. FTP, or File Transfer Protocol, is what you use to retrieve a text file, software, or other item from a remote host and copy it to your computer. (In Internet jargon the acronym "FTP" is used as both a noun and a verb. It's a little awkward, but those who use it know what they mean.) The objectives of FTP are:
Normal practice is to ftp to the host you want and log in with your user ID and password for that host. Once you are logged onto the host computer, you may search directories and view files, then copy them to your computer. Of course, most systems have protections on their files and directories that limit access and copy capabilities in some ways. FTP, in fact, requires that you have an account on the machine you are accessing. Public sites which offer programs free of charge use a system called "anonymous FTP", described below.
FTP recognizes six types of files, of which only two are commonly used: ASCII (text files), and binary (all other kinds of files). The most common error in using FTP is choosing to transfer a file in the wrong mode. When you transfer an ASCII file between different kinds of computers that store files differently, ASCII mode automatically adjusts the file during the transfer so that the file is a valid text file when it is stored on the receiving end. A binary file is left alone and transferred verbatim. An ASCII file can, therefore, be transferred in ASCII mode, but a garbled mess results if a binary file is transferred that way. If you are FTPing files between two computers of the same type, such as from one UNIX system to another, you can and should do all your transfers in binary mode. Whether you're transferring a text file or a nontext file, it doesn't require any conversion, so binary mode results in a more error-free transfer. Transferring in binary mode also retains text formatting.
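Why a binary file turns into a garbled mess in ASCII mode can be seen in a simplified sketch. Real FTP text transfers convert through a network-standard text format; the functions below are assumptions that model only the visible effect, namely that line-ending bytes are rewritten in ASCII mode and left alone in binary mode. The file contents are invented.

```python
def ascii_mode_transfer(data, sender_eol=b"\n", receiver_eol=b"\r\n"):
    """Mimic a text-mode transfer by rewriting the sender's line endings."""
    return data.replace(sender_eol, receiver_eol)


def binary_mode_transfer(data):
    """Mimic a binary-mode transfer: the bytes arrive unchanged."""
    return data


# A small text file with UNIX-style line endings.
TEXT_FILE = b"line one\nline two\n"

# An invented binary payload that happens to contain the byte 0x0A,
# which a text-mode transfer mistakes for a line ending.
BINARY_FILE = bytes([0x89, 0x50, 0x0A, 0x00, 0xFF])
```

The text file comes through ASCII mode as valid text for the receiving system, but the five-byte binary payload arrives six bytes long because an extra byte was inserted in the middle of it. In binary mode both files arrive exactly as sent.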
Anonymous FTP was developed to allow easy program and file transfers even when the recipient is not known to the host. Freeware available for downloading uses this technique. The basic system is the same as FTP, except when it prompts you for a user name, you enter "anonymous", typing in your address as the password. Many anonymous FTP sites establish time limits and/or number of users at one time, so you may have to try more than once to access desirable sites.
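An anonymous FTP session can be sketched with Python's standard ftplib module. The fetch_anonymous function and its ftp_factory parameter are names invented for this sketch, and any host, path, or address you pass in is your own choice; nothing here refers to a specific real archive site.

```python
import ftplib


def fetch_anonymous(host, remote_path, email, ftp_factory=ftplib.FTP):
    """Download one file over anonymous FTP and return its bytes.

    ftp_factory exists only so a stand-in can be supplied for testing;
    by default the standard library's ftplib.FTP class is used.
    """
    chunks = []
    ftp = ftp_factory(host)
    try:
        # By convention the user name is "anonymous" and the password
        # is the caller's e-mail address.
        ftp.login(user="anonymous", passwd=email)
        # retrbinary transfers the file in binary mode, unchanged,
        # handing each block of data to the callback.
        ftp.retrbinary("RETR " + remote_path, chunks.append)
        ftp.quit()
    finally:
        # Make sure the connection is released even if a step failed.
        ftp.close()
    return b"".join(chunks)
```

Because busy sites limit the number of simultaneous anonymous users, a real caller would probably wrap this in a retry; that is left out to keep the sketch short.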
By far the most widely used interactive Internet services are the various forms of remote login. What these services do is simple: you login to a remote host as if your terminal were attached directly to that host. Because all hosts on the Internet are officially equal, you can login to a host on the other side of the world as easily as one down the hall.
One step beyond electronic mail is the ability to control a remote computer using Telnet. This feature lets you virtually teleport anywhere on the network and use resources located physically at that host.
Telnet (along with FTP) has been around since the beginnings of the Internet. It provides the most basic kind of computer connectivity. It lets you sit at one computer and use another computer somewhere out there on the Internet, just as if you were sitting in front of that other computer. More accurately, as if you were using an old-fashioned command-line terminal directly connected to that computer.
For example, if you are anywhere in the world with a computer on the Internet, you can Telnet from that computer to one at, say, the University of California at Davis. You will then be able to use the UCD computer just as if you were on campus. (Enlightened academic conferences are beginning to provide clusters of terminals for just this purpose.)
Telnet has been integrated into the World Wide Web, but only to a limited extent. A link on a Web page can be set up to open a Telnet connection with some specific computer. If you follow that link, however, you drop off the edge of the Web. You can use the remote computer, but you will have to do so by following its rules and procedures. Ordinarily, the Telnet links that you come across in the Web will be to public sites that are designed to be easy to use.
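A Telnet link on a Web page is written as a URL beginning with telnet://, and a browser must pull the host, port, and optional login name out of it before opening the connection. The sketch below shows that step using Python's standard urllib.parse module. The parse_telnet_link function is a name invented here, and the example in the usage note uses the Lynx demonstration host mentioned earlier only as sample input.

```python
from urllib.parse import urlsplit

DEFAULT_TELNET_PORT = 23  # the standard port assigned to Telnet


def parse_telnet_link(url):
    """Return (host, port, user) for a telnet:// URL; user may be None."""
    parts = urlsplit(url)
    if parts.scheme != "telnet":
        raise ValueError("not a telnet URL: " + url)
    if not parts.hostname:
        raise ValueError("telnet URL has no host: " + url)
    port = parts.port or DEFAULT_TELNET_PORT
    return parts.hostname, port, parts.username
```

For the link telnet://www@www.cc.ukans.edu/ this yields the host www.cc.ukans.edu, port 23, and the login name www. Everything after that point is between you and the remote computer, which is why following such a link takes you off the Web.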


Last Update May 8, 1997