8.6 The World Wide Web
VIII. World Wide Web
1. World Wide Web Definition: an ambitious client-server system that offers a
simple, consistent interface to vast resources of the Internet.
* It is the name given to a large collection of Internet-accessible text,
pictures and data, each of which can contain one or more links (highlighted and
colored text or icons) to other data.
* Such text, pictures, and data containing links are called hypertext.
* The World Wide Web represents a network of links.
* Originators: Vannevar Bush [1945, Memex information machine]
Ted Nelson [1981, Xanadu]
Apple [1987, HyperCard]
Tim Berners-Lee, Robert Cailliau [1989-, CERN project]
WWW announced on Usenet 5/91 [1992, Lynx]
Marc Andreessen [1993, X-Mosaic], followed by PC Air Mosaic
and MacMosaic
* A client program (browser) runs on your computer and contacts a remote server
program.
* Browsers now act not only as Web clients but also as gopher, ftp, Usenet, and
mail clients.
2. The Web Definition: An enormous Internet-based information utility
encompassing not only Web servers offering hypertext data, but all the gopher
servers on the net, all the anonymous ftp sites, all the Usenet discussion
(news)groups and mail.
3. Note: The most important aspect of the Web is not the links that connect
one data item to another, it is the content between the links.
4. Information Dissemination in Communication Systems:
Disadvantage: Limited large group participation (response)
*[Snail/Express] Mail [Discrete, one-one, different time different place]
+Fax machines [Fast Discrete, one-one, different time different place]
*Telephone [Discrete, one-one, same time different place]
+Answering machines [Discrete, one-one, different time different place]
*Radio [Continuous update, small group-many, same time
different place]
*Television [Continuous update, small group-many, same time
different place]
*The Newspaper [Daily or less Update, small group-many, different time
and place]
*Magazines [Weekly or less Update, small group-many, different
time and place]
*Books [Yearly or less Update, one-many, different time
and place]
Internet Benefits:
* Potentially universal participation (interaction)
* Nearly instantaneous support of one-one, one-many and many-many
connections
* Satisfies humanity's need (urge) to communicate
*E-mail [Discrete, one-many, different time and place]
*E-mail Mailing Lists [Updated, many-many, different time and place,
-less accessible]
*Usenet [Updated, many-many, different time and place,
-limited mesg time]
*Gopher [Updated, many-many, different time and place,
menu based, + orgs]
*The Web [Updated, many-many, different time and place,
+way people think]
* The Web is the 4th effort to connect mankind on the Internet.
+ It is an information system with words, ideas, pictures, with data type
expansion
+ It has links to let you jump from one item to another
+ It has a user interface that is familiar and easy to use
+ It has client programs for PCs, Macintoshes used by the majority of people
5. Using the Web:
*1. Procedure:
* Invoke a network-connected browser (e.g. Mosaic, Netscape, Explorer, Lynx)
* In the File menu, open a location (url) to view: Type its address
e.g. http://www.yahoo.com
* The browser fetches the information to your system and displays it on your
screen
* Items that are highlighted via underline (lynx) or in color are links to
elsewhere
* Use the arrow keys to move and <CR> to invoke (lynx) or
mouse to move and click to invoke a link
Subjectively, you are "going on tangents" or "surfing", "visiting" or
"navigating" the web.
Behaviorally, you are always either (1) Reading Text [20% est.]
(2) Looking at pictures or animation [4% est.] (3) Issuing commands [1% est.]
or (4) WAITING [75% est.]
*2. Notes:
*The Web is a mixture of two kinds of sites: (1) created by individuals about
themselves (2) containing information about an organization(s).
*When you move your mouse onto a highlighted link, your browser shows the URL
associated with the link
* Web Page: The Contents of a single file containing hypertext links to other
pages
Web Site: A host containing one or more pages (related or not)
Home Page: The main [menu] page for a Web Site; Also the starting
page that can be loaded when your browser is launched
* Most browsers let you see, copy and save hypertext source for the viewed
web page.
* Use a word processor to convert existing documents into HTML
* Use an HTML editor with integrated browser to create new Web files.
* Create a Web page from your Bookmark List so you can describe the entries in
more detail.
* On UNIX systems, HTML files should have permissions set to 704 (public read
only) and their subdirectories 705 (public directory read and access only).
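The permission settings in the last note can be applied with a short script. This is only a minimal sketch in Python: the modes 704 and 705 are the ones given in these notes, while the function name `publish_tree` and the directory name `public_html` are assumptions made for illustration.

```python
import os

FILE_MODE = 0o704  # owner: read/write/execute; group: none; others: read only
DIR_MODE = 0o705   # owner: read/write/execute; group: none; others: read and access


def publish_tree(root):
    """Apply the modes from the notes to `root` and everything below it."""
    os.chmod(root, DIR_MODE)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            os.chmod(os.path.join(dirpath, name), DIR_MODE)
        for name in filenames:
            os.chmod(os.path.join(dirpath, name), FILE_MODE)


if __name__ == "__main__":
    # "public_html" is an assumed directory name, not one taken from the notes.
    if os.path.isdir("public_html"):
        publish_tree("public_html")
```

The sketch treats every file under the directory as a Web file; a real site may want to skip private files.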
*3. Uniform Resource Locators (URLs)
* Unique addressing for hypertext and other network items or services
( gophers, anonymous ftp sites, Usenet newsgroups, wais databases)
* Format: protocol://hostname[:port]/path or
protocol:description (used by news and mailto)
Protocols Meaning
afs File accessed via the Andrew File System
cid Content identifier for a Mime body part message
file Access to a [local] file
ftp File accessed via [anonymous] ftp
gopher Gopher resource
http Hypertext Transfer Protocol resource
mailserver Data accessed via a mail server
mailto Mail a message to a specified address
mid Identifier of a specific mail message
news Usenet Newsgroup
nfs File accessed via the Network File System
nntp Usenet news for local NNTP access only
prospero Resource accessed via a Prospero directory server
rlogin Interactive rlogin session
telnet Interactive telnet session
tn3270 Interactive 3270 telnet session
wais Access to a Wais database
z39.50 Access to a database via a Z39.50-type query
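To see how the two URL formats above break apart, here is a small sketch using Python's standard `urllib.parse` module. The helper name `describe_url` is made up for illustration; the example addresses are ones that appear elsewhere in these notes.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the parts named in the format line above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,  # None for forms like mailto: and news:
        "port": parts.port,          # None when the URL omits :port
        "path": parts.path,
    }


if __name__ == "__main__":
    for example in (
        "http://www.yahoo.com",
        "http://inktomi.cs.berkeley.edu:1234/",
        "mailto:katz@ned.highline.edu",
    ):
        print(describe_url(example))
```

For the `protocol:description` form, the description ends up in the path field and the hostname is empty.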
*4. Invoking a URL
(1) Autoload it in advance when the browser is launched
(2) Pull down a menu item to "open" a URL and type it in the dialog box.
(3) Type the URL directly in the browser's box near the top of the window
(4) Click on the builtin browser buttons assigned to specific resources
to go there.
(5) Choose an item from your bookmark list and/or history list
*5. To stop a web page from loading
(1) Press STOP button on the browser
(2) Press the animated icon (upper right) on the browser to bring back the
home page
(3) Press the escape key <ESC> on your keyboard (lynx)
(4) Select another URL and invoke it while the current one is loading
*6. Types of Links
* Highlighted words (Representing URLs)
* Small graphical element like a button or picture (Representing URLs)
* Interactive Form: you can enter information that is submitted to remote
web site to be processed by an existing CGI Script (CGI = Common Gateway
Interface) on the remote server.
* Image Map: A picture (photo or drawing) menu where the various parts of
the image act as separate links.
Discrete: A collage of several small pictures
Continuous: Clicked mouse pointer coordinates determine what is
downloaded next
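For the Interactive Form link type, it may help to see what the browser actually sends. The sketch below shows one common case, assuming the form uses the GET method: the fields are packed into name=value pairs appended to the script's URL. The script path `/cgi-bin/search` and the field names are hypothetical, and `build_form_url` is a name invented for this example.

```python
from urllib.parse import urlencode, parse_qs


def build_form_url(action_url, fields):
    """Return the URL a browser would request for a GET form submission."""
    return action_url + "?" + urlencode(fields)


if __name__ == "__main__":
    url = build_form_url(
        "http://www.host.com/cgi-bin/search",       # hypothetical script URL
        {"keywords": "lynx browser", "max": "10"},  # hypothetical field names
    )
    print(url)
    # What a script on the server side could recover from the query string:
    print(parse_qs(url.split("?", 1)[1]))
```

Forms that use the POST method send the same encoded pairs in the body of the request instead of in the URL.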
*7. History List and Bookmarks
* The History List is maintained for the current browser session, showing
where you have been in reverse order. There are also buttons on the browser
that help you go "back" or "Forward" (left and right arrows are possible).
Successive "back" clicks take you in reverse towards your original home page.
* The History List can be shown and an entry directly selected by mouse click.
* The Bookmark List (Hot List) contains your saved Web addresses of all your
sessions with this browser.
- Your browser can save this Bookmark List to a file
- Your browser can let you edit the Bookmark List to organize this
into categories
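The behavior of the History List and the "back" and "forward" buttons can be modeled in a few lines. This is a sketch of the idea, not how any particular browser implements it; the class and method names are invented, and dropping the "forward" entries when a new link is followed is an assumption about typical behavior.

```python
class History:
    """Session history: a list of visited URLs plus a current position."""

    def __init__(self, home_page):
        self.pages = [home_page]
        self.position = 0

    def visit(self, url):
        # Assumption: following a new link discards anything "forward"
        # of the current page.
        self.pages = self.pages[: self.position + 1]
        self.pages.append(url)
        self.position += 1

    def back(self):
        # Successive calls stop at the original home page.
        if self.position > 0:
            self.position -= 1
        return self.pages[self.position]

    def forward(self):
        if self.position < len(self.pages) - 1:
            self.position += 1
        return self.pages[self.position]

    def listing(self):
        """Most recent page first, as the History List is described above."""
        return list(reversed(self.pages))
```

A Bookmark List differs mainly in that it is saved to a file, so it survives from one session to the next.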
*8. Inline and External Images
* An Inline Image is a picture that is part of a web page.
It is usually a link to a file containing the full (external) image.
* An External Image is a picture that is itself a web page, shown in its own
window
* Images are stored in formats like GIF and JPEG. If your browser cannot display
a format itself, it must invoke a helper application (plug-in) that acts as a
GIF or JPEG viewer (your browser can be made aware of these in its
configuration or preferences area)
*9. Sounds and Video
* Sound is stored in formats like AU (audio) and WAV (Waveform)
* Video is stored in formats like MPEG (Moving Picture Experts Group),
QuickTime and AVI (Audio Video Interleave)
*Regular sound files are downloaded, stored and then played;
*Real-time sounds are played as they are downloaded but not retained.
*10. Updating Information automatically
* Client Pull: Your browser is told by the Web page to reload itself
every n seconds. This refreshes the information and requires
a repeated browser connection.
* Server Push: The Web server sends new data on its own to your browser.
This refreshes the information, and the browser-server connection is
kept open indefinitely.
* Requirements: A Web page must specifically request client pull, a server
must be designed to offer server push, and your browser must know how to
support these features.
* Examples: Web based talk facilities, Financial Services Stock quotes
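Client pull is conventionally requested with a META tag in the page's header that carries a "Refresh" value; the notes above do not name the mechanism, so treat that detail as an addition. The sketch below generates such a page. The function name, title and body text are placeholders for illustration.

```python
def client_pull_page(title, body, seconds):
    """Return HTML that asks the browser to reload the page every `seconds`."""
    return (
        "<HTML><HEAD>\n"
        f'<META HTTP-EQUIV="Refresh" CONTENT="{seconds}">\n'
        f"<TITLE>{title}</TITLE>\n"
        "</HEAD><BODY>\n"
        f"{body}\n"
        "</BODY></HTML>\n"
    )


if __name__ == "__main__":
    # A stock-quote page that asks to be reloaded every 30 seconds.
    print(client_pull_page("Quotes", "Latest prices go here.", 30))
```

Each reload is a fresh connection from the browser, which is the "repeated browser connection" cost mentioned above.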
*11. Web Directories and Search Engines
* Web pages containing a list of other Web Sites by Category
* Web pages containing a list of Search Engines invoked by filling in a form
* Search Engines supply a list of links based on your keyword(s), with the
most closely matched links first. Search engines/tools have two components:
collection and search. The collection part (also known as an automated robot,
Wanderer, Spider, Harvest or Pursuit) roams Internet sites, mostly www, gopher
and ftp sites, brings back resources, then sorts and indexes them to create a
database.
* Boolean Searches permit AND and OR operators on search item keywords
* Examples: Web Directories
Yahoo! http://www.yahoo.com
Magellan http://www.mckinley.com
Point http://www.pointcom.com
W3 Servers http://www.w3.org/hypertext/DataSources/WWW/Servers.html
Metroscope http://isotropic.com/metro/scope.html
Yanoff http://www.uwm.edu/Mirror/inet.services.html
* Search Engines:
AltaVista http://www.altavista.digital.com
Lycos: http://www.lycos.com
WebCrawler: http://www.webcrawler.com
Inktomi: http://inktomi.cs.berkeley.edu:1234/
DejaNews: http://www.dejanews.com/forms/dnq.html
World Wide Web Worm: http://wwww.cs.colorado.edu/wwww/
excite NetSearch: http://www.excite.com/
* Searching Search Engines (Multiple Simultaneous Queries):
SavvySearch: http://cage.cs.colostate.edu:1969/
IBM InfoMarket Search: http://www.infomkt.ibm.com/
MetaCrawler: http://metacrawler.cs.washington.edu:8080/index.html
ProFusion: http://www.designlab.ukans.edu/ProFusion.html
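The two components of a search engine, collection and search, can be illustrated with a toy index. This is a deliberately small sketch: the "collection" step is given page text directly instead of roaming the network, results are not ranked by closeness of match, and all function names and sample pages are invented for the example.

```python
from collections import defaultdict


def build_index(pages):
    """Collection step: map each lowercase word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search_and(index, keywords):
    """Boolean AND: pages containing every keyword."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*sets) if sets else set()


def search_or(index, keywords):
    """Boolean OR: pages containing at least one keyword."""
    result = set()
    for k in keywords:
        result |= index.get(k.lower(), set())
    return result


if __name__ == "__main__":
    # Hypothetical pages standing in for what a robot would bring back.
    pages = {
        "http://a.example/": "gopher and ftp servers",
        "http://b.example/": "web servers and browsers",
        "http://c.example/": "lynx text browsers",
    }
    index = build_index(pages)
    print(sorted(search_and(index, ["servers", "ftp"])))
    print(sorted(search_or(index, ["lynx", "gopher"])))
```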
*12. Customizing your Browser environment by modifying:
* Text size and typeface and background, foreground Colors
* Links Display Characteristics (Color, underlining)
* Home (Startup) Page Turn off, on, Specify a URL to load
* Windows size and shape
* Window Manipulation characteristics: Multiple windows,
iconized windows
* Image Loading: Turn off, on automatically
* Viewers: Indicate new programs to read special formatted data
* Toolbar(s): Show all, none, customize yours
* Bookmark List: Add, delete, edit links, also save to a file,
categorically organize them
* The Cache: Choose: in memory (single session) or disk stored
(multisession), when to flush (the cache). [ Flush often
via browser or manually to save disk space]
*13. Hints for using URLs
* Don't change upper and lower case characters within URLs (Hostnames are
case insensitive, but pathnames are case sensitive)
* Transcribe URLs carefully: use copy and paste whenever possible
* Recognize Common host name patterns: www for Web Servers, ftp for anonymous
ftp servers and gopher for gopher servers
* You can guess a URL by typing only the host name and hoping for a home page
* If the URL doesn't work, shorten it from the right, one slash at a time,
until it does.
* On some browsers, http:// is assumed, so you can just say: www.host.com
* Type the entire URL when sending a URL by mail
* Files whose names end with .html or .htm are hypertext files
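The hint about shortening a URL from the right can be made mechanical. The sketch below lists the successively shorter addresses you would try, ending at the bare host; `shorter_urls` is a name invented here, and the example address is one from the directory list in these notes.

```python
from urllib.parse import urlsplit, urlunsplit


def shorter_urls(url):
    """Yield `url` trimmed one path component at a time, ending at the host."""
    parts = urlsplit(url)
    path = parts.path
    while path not in ("", "/"):
        # Remove the last component, keeping the slash that precedes it.
        path = path.rstrip("/")
        path = path[: path.rfind("/") + 1]
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    start = "http://www.w3.org/hypertext/DataSources/WWW/Servers.html"
    for candidate in shorter_urls(start):
        print(candidate)
```

Only the path is trimmed; the host name is left untouched, and the case of the remaining path characters is preserved, in keeping with the first hint.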
*14. Java: A Language For Distributed Applications
* Java: an object oriented programming language ("C++ without the pointers")
* Java applets are independent of the computer system they run on
* The idea: A Web page containing a very small program (applet) that your
browser downloads and executes dynamically while your web page is displayed
* Examples include animation, rotatable graphics, games, calculation, and
other special effects
* History: - Developed by James Gosling (Sun) in 1992 for Consumer Computer
products (PDAs)
- HotJava Browser developed in 1993
- Official Announcement for Java Technology May 1995
- Adopted by Netscape (2.0) and Microsoft Explorer (planned)
* Features: - Highly Interactive,
- Uses speed of your local computer, not the network or modem
- Both an Interpreted (at runtime on Client browser "Viewer")
and Compiled language (on originating system)
- Architecture Neutral byte code (Computer System independent)
- Portable
- Multithreaded
- Extensible
- Dynamic data types (links specialized Java Application to server which supports the new data type)
- Dynamic Protocols (Seamless integration of new protocols
with existing ones like electronic commerce)
- Security-oriented: (1) No inherent semantics for altering a
computer environment; (2) restricted access to system level
services, (3) UNIX directories accessed only:
/tmp/hotjava and ~/.hotjava via Environment variables
(4) discerns applets inside and outside firewalls,
(5) discerns socket and file manipulation by applet.
(6) digital signatures can be assigned to classes and verified
before loading.
6. Lynx: A text based Web browser
* No complex data types means faster downloading
* Operating Lynx remotely in a UNIX Shell means faster downloading
* Lynx permits you to concentrate on the information
* Lynx can open frequently referenced files that were downloaded and
saved locally
7. Lynx Command Syntax (a .lynxrc file may be used to customize lynx)
$ lynx [options] [<URL> or pathname ] # lynx file acts like a UNIX browser
Options are:
-anonymous used to specify the anonymous account
-auth=id:pw authentication information for protected forms
-case enable case sensitive user searching
-cache=NUMBER NUMBER of documents cached in memory. (default is 10)
-cfg=FILENAME specifies a lynx.cfg file other than the default.
-display=DISPLAY set the display variable for X "exec"ed programs
-dump dump the first file to stdout and exit
-editor=EDITOR enable edit mode with specified editor
-emacskeys enable emacs-like key movement
-error_file=file write the HTTP status code here
-fileversions include all versions of files in local VMS directory listings
-force_html forces the first document to be interpreted as HTML
-ftp disable ftp access
-get_data User data for get forms, read from stdin,terminated by
'---' on a line
-help print this usage message
-homepage=URL set homepage separate from start page
-index=URL set the default index file to URL
-localhost disable URLs that point to remote hosts
-mime_header Show full mime header
-nobrowse disable directory browsing
-noprint disable print functions
-noredir Don't follow Location: redirection
-nostatus disable the miscellaneous information messages
-post_data User data for post forms, read from stdin, terminated by
'---' on a line
-print enable print functions (DEFAULT)
-restrictions[=options] use -restrictions to see list below
-rlogin disable rlogins
-selective require .www_browsable files to browse directories
-show_cursor don't hide the cursor in the lower right corner
-source dump the source of the first file to stdout and exit
-telnet disable telnets
-term=TERM set terminal type to TERM
-trace turns on WWW trace mode
-version print Lynx version information
-vikeys enable vi-like key movement
USAGE: lynx -restrictions=[option][,option][,option]
List of Options:
all restricts all options.
default same as commandline option -anonymous. Disables default
services for anonymous users. Currently set to, all restricted
except for: inside_telnet, outside_telnet, inside_news,
inside_ftp, outside_ftp, inside_rlogin, outside_rlogin, goto,
jump and mail. Defaults settable within userdefs.h
bookmark disallow changing the location of the bookmark file.
bookmark_exec disallow execution links via the bookmark file
disk_save disallow saving binary files to disk in the download
menu
download disallow downloaders in the download menu
editor disallow editing
exec disable execution scripts
exec_frozen disallow the user from changing the execution link
file_url disallow using G)oto to go to file: URL's
goto disable the 'g' (goto) command
inside_ftp disallow ftps for people coming from inside your domain
inside_news disallow USENET news posting for people coming from
inside your domain
inside_rlogin disallow rlogins for people coming from inside your
domain
inside_telnet disallow telnets for people coming from inside your
domain
jump disable the 'j' (jump) command
mail disallow mail
news_post disallow USENET News posting setting in the
O)ptions menu
option_save disallow saving options in .lynxrc
outside_ftp disallow ftps for people coming from outside your domain
outside_news disallow USENET news posting for those coming from
beyond your domain
outside_rlogin disallow rlogins for people coming from outside your
domain
outside_telnet disallow telnets for people coming from outside your
domain
print disallow most print options
shell disallow shell escapes, lynxexec, and lynxcgi G)oto's
suspend disallow Control-Z suspends with escape to shell
telnet_port disallow specifying a port in telnet G)oto's
$ lynx -version
Lynx Version 2-4-2
(c)1995 University of Kansas
<lynx-help@ukanaix.cc.ukans.edu>
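The command syntax above can also be assembled from a script, for example to fetch a page as plain text on a schedule. The sketch below only builds the argument list; it uses three options from the list above (-dump, -cfg and -restrictions) and does not run lynx itself. The function name `lynx_command` and its parameters are invented for illustration.

```python
import shlex


def lynx_command(target, dump=False, restrictions=(), cfg=None):
    """Build an argument list for lynx using options listed above."""
    argv = ["lynx"]
    if dump:
        argv.append("-dump")
    if cfg is not None:
        argv.append("-cfg=" + cfg)
    if restrictions:
        argv.append("-restrictions=" + ",".join(restrictions))
    argv.append(target)
    return argv


if __name__ == "__main__":
    argv = lynx_command(
        "http://www.yahoo.com",
        dump=True,
        restrictions=("shell", "editor", "mail"),
    )
    print(shlex.join(argv))
    # To actually run it (requires lynx to be installed):
    #   import subprocess; subprocess.run(argv, check=True)
```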
8. LYNX Cursor Commands
MOVEMENT:
Down arrow - Highlight next topic
Up arrow - Highlight previous topic
Left arrow - Return to previous topic
Right arrow - Jump to highlighted topic
Return, Enter - Launch Selection
SCROLLING:
+ (or space) - Scroll down to next page
- (or b) - Scroll up to previous page
OTHER:
? (or H) - Help (this screen)
a - Add the current link to your bookmark file
c - Send a comment to the document owner
d - Download the current link
e - Edit the current file
g - Goto a user specified URL or file.
i - Show an index of documents
m - Return to main screen
o - Set your options
p - Print to a file, mail, printers, or other entity
q - Quit (Asks if you're sure?)
Q - Quit (quick quit)
^D - Quit Lynx (quick quit)
/ - Search for a string within current document (be at top of
document)
s - Enter a search string for an external search.
n - Go to the next search string
v - View your bookmark file
z - Cancel transfer in progress
[backspace] - Go to the history page
= - Show file and link info
\ - Toggle document (HTML) source/rendered view
! - Spawn your default shell
^R - Reload current file and refresh the screen
^W - Refresh the screen
^U - Erase input line
^G - Cancel input or transfer
Questions? Robert Katz: katz@ned.highline.edu
Last Update December 7, 1999