8.6 The World Wide Web

VIII. World Wide Web

1. World Wide Web Definition: an ambitious client-server system that offers a 
simple, consistent interface to vast resources of the Internet.
* It is the name given to a large collection of internet accessible text, 
pictures and data, that can contain one or more links (highlighted and 
colored text or icons) to other data.  
* Such text, pictures, and data containing links are called hypertext.  
* The World Wide Web represents a network of links.
* Originators:	Vannevar Bush [45 Memex information machine]
			Ted Nelson [81 Xanadu], 
			Apple [87 HyperCard ]
			Tim Berners-Lee, Robert Cailliau [89- CERN project]
			WWW announced on Usenet 5/91 [92 Lynx ]
			Marc Andreessen [93 X-Mosaic] and then PC Air Mosaic
 			and MacMosaic
* A client program (browser) runs on your computer and contacts a remote server
program.
* Browsers now act not only as Web clients but also as gopher, ftp, Usenet, and 
mail clients.
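
As a rough illustration of the client-server exchange described above, the
sketch below builds the text a browser-like client might send to a Web server
to ask for a page. The function name build_get_request is hypothetical, the
HTTP/1.0 request form is an assumption (the notes do not name a protocol
version), and nothing is actually sent over the network.

```python
def build_get_request(host, path="/"):
    """Return the request text a client could send to the server `host`.

    Assumption: a plain HTTP/1.0 GET request. The Host line is optional in
    HTTP/1.0 but widely accepted by servers.
    """
    return "GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".format(
        path=path, host=host
    )


# The client would open a connection to the server (port 80 by default),
# send this text, and then read back the page to display.
request = build_get_request("www.yahoo.com")
print(request)
```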

2. The Web Definition: An enormous Internet-based information utility 
encompassing not only Web servers offering hypertext data, but all the gopher 
servers on the net, all the anonymous ftp sites, all the Usenet discussion 
(news)groups and mail.

3. Note: The most important aspect of the Web is not the links that connect 
one data item to another; it is the content between the links.

4. Information Dissemination in Communication Systems:

Disadvantage: Limited large group participation (response)

*[Snail/Express] Mail	[Discrete, one-one, different time different place]
+Fax machines		[Fast Discrete, one-one, different time different place]
*Telephone		[Discrete, one-one, same time different place]
+Answering machines 	[Discrete, one-one, different time different place]
*Radio			[Continuous update, small group-many, same time 
				different place]
*Television		[Continuous update, small group-many, same time 
				different place]
*The Newspaper		[Daily or less Update, small group-many, different time 
				and place]
*Magazines		[Weekly or less Update, small group-many, different 
				time and place]
*Books			[Yearly or less Update, one-many, different time 
				and place]

Internet Benefits: 
* Potentially universal participation (interaction)
* Nearly instantaneous support of one-one, one-many and many-many 	
connections
* Satisfies humanity's need (urge) to communicate

*E-mail			[Discrete, one-many, different time and place]
*E-mail Mailing Lists [Updated, many-many, different time and place,  
				-less accessible]
*Usenet			[Updated, many-many, different time and place,  
				-limited mesg time]
*Gopher			[Updated, many-many, different time and place, 
				menu based, + orgs]
*The Web		[Updated, many-many, different time and place, 
				+way people think]

* The Web is the 4th effort to connect mankind on the Internet.
+ It is an information system with words, ideas, pictures, with data type 
	expansion
+ It has links to let you jump from one item to another
+ It has a user interface that is familiar and easy to use
+ It has client programs for the PCs and Macintoshes used by the majority of 
	people

5. Using the Web:

*1. Procedure: 
* Invoke a network-connected browser (e.g. Mosaic, Netscape, Explorer, Lynx)
* In the File menu, open a location (URL) to view: Type its address
 		e.g.  http://www.yahoo.com
* The browser fetches the information to your system and displays it on your 
screen
* Items that are highlighted via underline (lynx) or in color are links to 
elsewhere
* Use the arrow keys to move and <CR> to invoke (lynx) or 
		mouse to move and click to invoke a link

Subjectively, you are "going on tangents" or "surfing", "visiting" or 
"navigating" the web.  
Behaviorally, you are always either (1) Reading Text [20% est.]  
(2) Looking at pictures or animation [4% est.] (3) Issuing commands [1% est.] 
or (4) WAITING [75% est.]

*2. Notes: 
* The Web is a mixture of two kinds of sites: (1) created by individuals about 
themselves (2) containing information about an organization(s).
* When you move your mouse onto a highlighted link, your browser shows the URL 
associated with the link
* Web Page: The contents of a single file containing hypertext links to other 
pages
* Web Site: A host containing one or more pages (related or not)
* Home Page: The main [menu] page for a Web Site; also the starting 
page that can be loaded when your browser is launched
* Most browsers let you see, copy and save hypertext source for the viewed 
web page.
* Use a word processor to convert existing documents into HTML
* Use an HTML editor with integrated browser to create new Web files.
* Create a Web page from your Bookmark List so you can describe each entry in 
more detail.
* On UNIX systems, HTML files should have permissions set to 704 and their 
directories to 705: files are readable by the public, and directories are 
readable and accessible (searchable) by the public.
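
The permission note above can be sketched as a small script. This is only an
illustration under stated assumptions: the function name publish and the demo
file names are hypothetical, and the script treats files ending in .html or
.htm as the HTML files to be made public.

```python
import os
import stat
import tempfile

FILE_MODE = 0o704  # owner: read/write/execute, group: none, others: read
DIR_MODE = 0o705   # owner: read/write/execute, group: none, others: read + access


def publish(root):
    """Apply the modes above to `root`, its subdirectories, and its HTML files."""
    for dirpath, dirnames, filenames in os.walk(root):
        os.chmod(dirpath, DIR_MODE)
        for name in filenames:
            if name.endswith((".html", ".htm")):
                os.chmod(os.path.join(dirpath, name), FILE_MODE)


# Demonstration on a throwaway directory so no real files are touched.
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, "pics"))
page = os.path.join(demo_root, "index.html")
with open(page, "w") as handle:
    handle.write("<html></html>")

publish(demo_root)
file_mode = stat.S_IMODE(os.stat(page).st_mode)
dir_mode = stat.S_IMODE(os.stat(os.path.join(demo_root, "pics")).st_mode)
print(oct(file_mode), oct(dir_mode))
```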

*3. Uniform Resource Locators (URLs) 
* Unique addressing for hypertext and other network items or services 
( gophers, anonymous ftp sites, Usenet newsgroups, wais databases)
* Format: 	protocol://hostname[:port]/path or 
		protocol:description (used by news and mailto)

	Protocols 	Meaning
	afs		File accessed via the Andrew File System
	cid		Content identifier for a Mime body part message
	file		Access to a [local] file
	ftp		File accessed via [anonymous] ftp
	gopher		Gopher resource
	http		Hypertext Transfer Protocol resource
	mailserver	Data accessed via a mail server
	mailto		Mail a message to a specified address
	mid		Identifier of a specific mail message
	news		Usenet Newsgroup
	nfs		File accessed via the Network File System
	nntp		Usenet news for local NNTP access only
	prospero	Resource accessed via a Prospero directory server
	rlogin		Interactive rlogin session
	telnet		Interactive telnet session
	tn3270		Interactive 3270 telnet session
	wais		Access to a Wais database
	z39.50		Access to a database via a Z39.50-type query
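
To make the two URL formats above concrete, the following sketch splits a URL
into the parts named in the format line, using Python's standard
urllib.parse module. The helper name describe is hypothetical; the two sample
URLs are taken from elsewhere in these notes.

```python
from urllib.parse import urlsplit


def describe(url):
    """Split `url` into the parts named in the format line above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,  # None for the protocol:description form
        "port": parts.port,          # None when no :port is given
        "path": parts.path,
    }


# protocol://hostname[:port]/path
web = describe("http://inktomi.cs.berkeley.edu:1234/")
# protocol:description (no hostname part)
mail = describe("mailto:katz@ned.highline.edu")
print(web)
print(mail)
```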
	
*4. Invoking a URL
(1)	Autoload it in advance when the browser is launched
(2)	Pull down a menu item to "open" a URL and type it in the dialog box.
(3)	Type the URL directly in the browser's box near the top of the window
(4)	Click on the builtin browser buttons assigned to specific resources 
	to go there.
(5)	Choose an item from your bookmark list and/or history list

*5. To stop a web page from loading
(1) Press STOP button on the browser
(2) Press the animated icon (upper right) on the browser to bring back the 
home page
(3) Press the escape key <ESC> on your keyboard (lynx)
(4) Select another URL and invoke it while the current one is loading

*6. Types of Links
* Highlighted words (Representing URLs)
* Small graphical element like a button or picture (Representing URLs)
* Interactive Form: you can enter information that is submitted to remote 
web site to be processed by an existing CGI Script (CGI = Common Gateway 
Interface) on the remote server.
* Image Map: A picture (photo or drawing) menu where the various parts of 
the image act as separate links. 
	Discrete: A collage of several small pictures
	Continuous: Clicked mouse pointer coordinates determine what is 
			downloaded next
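
For the interactive form item above, this sketch shows how the information you
type into a form is commonly packed into a single string before it is
submitted to the CGI script. The field names query and max are hypothetical,
and the encoding shown is the usual URL-encoded form style; a given site's
form may differ.

```python
from urllib.parse import parse_qs, urlencode

# What the user typed into a (hypothetical) two-field search form.
fields = {"query": "hypertext history", "max": "10"}

# The browser joins name=value pairs with "&" and replaces spaces with "+".
encoded = urlencode(fields)
print(encoded)  # query=hypertext+history&max=10

# The CGI script on the remote server reverses the encoding.
decoded = parse_qs(encoded)
print(decoded["query"][0])
```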

*7. History List and Bookmarks
* The History List is maintained for the current browser session, showing 
where you have been in reverse order.  There are also buttons on the browser 
that help you go "Back" or "Forward" (left and right arrows are possible).  
Successive "Back" clicks take you in reverse towards your original home page.
* The History List can be shown and an entry directly selected by mouse click.
* The Bookmark List (Hot List) contains your saved Web addresses of all your 
sessions with this browser. 
	- Your browser can save this Bookmark List to a file
	- Your browser can let you edit the Bookmark List to organize this 
		into categories
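
The History List behavior described above can be modeled in a few lines. This
is a simplified sketch, not how any particular browser is implemented: the
class name History is hypothetical, and the rule that following a new link
discards the "Forward" entries is an assumption.

```python
class History:
    """A per-session history list with Back and Forward."""

    def __init__(self, home):
        self.pages = [home]  # oldest first; the home page is the first entry
        self.position = 0    # index of the page currently shown

    def visit(self, url):
        # Assumption: following a new link discards any "Forward" entries.
        del self.pages[self.position + 1:]
        self.pages.append(url)
        self.position += 1

    def back(self):
        if self.position > 0:
            self.position -= 1
        return self.pages[self.position]

    def forward(self):
        if self.position < len(self.pages) - 1:
            self.position += 1
        return self.pages[self.position]

    def listing(self):
        """Return the pages visited, most recent first."""
        return list(reversed(self.pages))


session = History("http://www.yahoo.com")
session.visit("http://www.lycos.com")
session.visit("http://www.excite.com/")
print(session.back())     # http://www.lycos.com
print(session.back())     # http://www.yahoo.com
print(session.forward())  # http://www.lycos.com
```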

*8. Inline and External Images
* An Inline Image is a picture that is part of a web page.  
	It is usually a link to a file containing the full (external) image.
* An External Image is a picture that is itself a web page, shown in its own 
	window
* Images are stored in formats like GIF and JPEG. To display these, your 
browser must invoke a helper application (Plug-in) that acts as a GIF or 
JPEG viewer. (Your browser can be made aware of these in its configuration or 
preferences area.)

*9. Sounds and Video	
* Sound is stored in formats like AU (audio) and WAV (Waveform)
* Video is stored in formats like MPEG (Motion Picture Experts Group), 
	Quicktime and AVI (Audio/Visual Interleaved Data)
* Regular sound files are downloaded, stored, and then played.
* Real-time sounds are played as they are downloaded but not retained.

*10. Updating Information automatically
* Client Pull: Your browser is told by the Web page to reload itself 
	every n seconds. This refreshes the information and requires a 
	new browser connection each time.
* Server Push: The Web server sends new data on its own to your browser.
	This refreshes the information and the browser-server connection is 
	kept open indefinitely.
* Requirements: A Web page must specifically request client pull, a server 
must be designed to offer server push, and your browser must know how to 
support these features.
* Examples: Web based talk facilities, Financial Services Stock quotes
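
One common way for a page to request client pull is a META "Refresh" tag in
its header. The sketch below generates such a page; the function name
client_pull_page and the sample stock quote are hypothetical, and browsers
that do not support the tag will simply show the page once.

```python
def client_pull_page(body, seconds):
    """Return HTML asking the browser to reload the page every `seconds`."""
    return (
        "<HTML><HEAD>\n"
        '<META HTTP-EQUIV="Refresh" CONTENT="{seconds}">\n'
        "</HEAD><BODY>\n"
        "{body}\n"
        "</BODY></HTML>\n"
    ).format(seconds=seconds, body=body)


# A page with a (made-up) stock quote that reloads itself every 30 seconds.
page = client_pull_page("XYZ Corp: 42 1/8", 30)
print(page)
```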
	
*11. Web Directories and Search Engines
* Web pages containing a list of other Web Sites by Category
* Web pages containing a list of Search Engines invoked by filling in a form
* Search Engines supply a list of links based on your keyword(s) in order of 
most closely matched links first.  Search engines/tools have two components: 
collection and search. The collection part (also known as an automated robot, 
Wanderer, Spider, Harvest, or Pursuit) roams Internet sites, mostly www, 
gopher, and ftp sites, brings back resources, and sorts and indexes them into 
a database. The search part matches your keywords against that database.
* Boolean Searches permit AND and OR operators on search item keywords
* Examples:
	* Web Directories:
		Yahoo!: http://www.yahoo.com
		Magellan: http://www.mckinley.com
		Point: http://www.pointcom.com
		W3 Servers: http://www.w3.org/hypertext/DataSources/WWW/Servers.html
		Metroscope: http://isotropic.com/metro/scope.html
		Yanoff: http://www.uwm.edu/Mirror/inet.services.html
	* Search Engines:
		AltaVista: http://www.altavista.digital.com
		Lycos: http://www.lycos.com
		WebCrawler: http://www.webcrawler.com
		Inktomi: http://inktomi.cs.berkeley.edu:1234/
		DejaNews: http://www.dejanews.com/forms/dnq.html
		World Wide Web Worm: http://wwww.cs.colorado.edu/wwww/ 
		excite NetSearch: http://www.excite.com/
	* Searching Search Engines (Multiple Simultaneous Queries):
		SavvySearch: http://cage.cs.colostate.edu:1969/
		IBM InfoMarket Search: http://www.infomkt.ibm.com/
		MetaCrawler: http://metacrawler.cs.washington.edu:8080/index.html
		ProFusion: http://www.designlab.ukans.edu/ProFusion.html
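
The two search engine components described in this section (collection and
search) and the Boolean AND and OR operators can be sketched with a toy
index. Everything here is illustrative: the page contents and the example
host names are made up, the function names are hypothetical, and ranking the
results by closeness of match is left out.

```python
def build_index(pages):
    """Collection step: map each word to the set of URLs containing it."""
    index = {}
    for url, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search_and(index, *words):
    """Return URLs that contain every keyword."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()


def search_or(index, *words):
    """Return URLs that contain at least one keyword."""
    return set().union(*(index.get(word.lower(), set()) for word in words))


# Made-up pages standing in for what a robot might bring back.
pages = {
    "http://a.example/": "gopher servers and ftp sites",
    "http://b.example/": "web servers offering hypertext",
    "http://c.example/": "hypertext on gopher",
}
index = build_index(pages)
both = search_and(index, "gopher", "hypertext")
either = search_or(index, "ftp", "web")
print(sorted(both))
print(sorted(either))
```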

*12. Customizing your Browser environment by modifying:
	* Text size and typeface and background, foreground Colors
	* Links Display Characteristics (Color, underlining)
	* Home (Startup) Page: Turn off, on, or specify a URL to load
	* Windows size and shape
	* Window Manipulation characteristics: Multiple windows, 
		iconized windows
	* Image Loading: Turn off, on automatically
	* Viewers: Indicate new programs to read special formatted data
	* Toolbar(s): Show all, none, customize yours
	* Bookmark List: Add, delete, edit links, also save to a file, 
		categorically organize them
	* The Cache: Choose: in memory (single session) or disk stored 
		(multisession), when to flush (the cache). [ Flush often 
		via browser or manually to save disk space]

*13. Hints for using URLs
* Don't change upper and lower case characters within URLs (Hostnames are 
case insensitive, but pathnames are case sensitive)
* Transcribe URLs carefully: use copy and paste whenever possible
* Recognize Common host name patterns: www for Web Servers, ftp for anonymous 
ftp servers and gopher for gopher servers
* You can guess a URL by typing the host name only and hoping for a home page
* If the URL doesn't work, shorten it from the right, one slash at a time, 
until it does.
* On some browsers, http:// is assumed, so you can just say: www.host.com
* Type the entire URL when sending a URL by mail 
* Files whose names end with .html or .htm are hypertext files
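
The hint about shortening a URL from the right can be written out as a small
helper that lists the URLs to try, in order. The function name
shortened_candidates is hypothetical; the sample URL is one of the directory
addresses listed earlier in these notes.

```python
from urllib.parse import urlsplit, urlunsplit


def shortened_candidates(url):
    """Return `url` followed by progressively shorter versions of it."""
    parts = urlsplit(url)
    path = parts.path
    candidates = [url]
    while path not in ("", "/"):
        # Ignore a trailing slash, then cut back to the previous slash.
        path = path.rstrip("/")
        path = path[: path.rfind("/") + 1]
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


tries = shortened_candidates(
    "http://www.w3.org/hypertext/DataSources/WWW/Servers.html"
)
for candidate in tries:
    print(candidate)
```

Each candidate would be tried in turn until one of them loads.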

*14. Java: A Language For Distributed Applications
* Java: an object oriented programming language ("C++ without the pointers")
* Language applets are Computer System independent
* The idea: A Web page containing a very small program (applet) that your 
browser downloads and executes dynamically while your web page is displayed
* Examples include animation, rotatable graphics, games, calculation, and 
other special effects
* History:  - Developed by James Gosling (Sun) in 1992 for Consumer Computer 
products (PDAs)
	    - HotJava Browser developed in 1993
	    - Official Announcement for Java Technology May 1995 
	    - Adopted by Netscape (2.0) and Microsoft Explorer (planned) 
* Features: - Highly Interactive, 
	    - Uses speed of your local computer, not the network or modem
	    - Both an Interpreted (at runtime on Client browser "Viewer") 
		and Compiled language (on originating system)
	    - Architecture Neutral byte code (Computer System independent)
	    - Portable
	    - Multithreaded
	    - Extensible
	    - Dynamic data types (links a specialized Java application to 
		a server that supports the new data type)
	    - Dynamic protocols (seamless integration of new protocols 
		with existing ones, like electronic commerce)
	    - Security-oriented: (1) No inherent semantics for altering a 
		computer environment; (2) restricted access to system-level 
		services; (3) on UNIX, only the directories 
		/tmp/hotjava and ~/.hotjava are accessed, via environment 
		variables; (4) discerns applets inside and outside firewalls; 
		(5) discerns socket and file manipulation by applet; 
		(6) digital signatures can be assigned to classes and 
		verified before loading.

6. Lynx: A text based Web browser
	* No complex data types means faster downloading
	* Operating Lynx remotely in a UNIX Shell means faster downloading
	* Lynx permits you to concentrate on the information
	* Lynx can open frequently referenced files that were downloaded and 
		saved locally

7. Lynx Command Syntax (a .lynxrc file may be used to customize lynx)
$ lynx [options] [<URL> or pathname ]	# lynx file acts like a UNIX browser
Options are:
-anonymous	used to specify the anonymous account
-auth=id:pw	authentication information for protected forms
-case		enable case sensitive user searching
-cache=NUMBER	NUMBER of documents cached in memory. (default is 10)
-cfg=FILENAME	specifies a lynx.cfg file other than the default.
-display=DISPLAY	set the display variable for X "exec"ed programs
-dump		dump the first file to stdout and exit
-editor=EDITOR	enable edit mode with specified editor
-emacskeys	enable emacs-like key movement
-error_file=file	write the HTTP status code here
-fileversions	include all versions of files in local VMS directory listings
-force_html	forces the first document to be interpreted as HTML
-ftp		disable ftp access
-get_data	User data for get forms, read from stdin,terminated by 
			'---' on a line
-help		print this usage message
-homepage=URL	set homepage separate from start page
-index=URL	set the default index file to URL
-localhost	disable URLs that point to remote hosts
-mime_header	Show full mime header
-nobrowse	disable directory browsing
-noprint	disable print functions
-noredir	Don't follow Location: redirection
-nostatus	disable the miscellaneous information messages
-post_data	User data for post forms, read from stdin, terminated by 
			'---' on a line
-print		enable print functions (DEFAULT)
-restrictions[=options]	use -restrictions to see list below
-rlogin		disable rlogins
-selective	require .www_browsable files to browse directories
-show_cursor	don't hide the cursor in the lower right corner
-source		dump the source of the first file to stdout and exit
-telnet		disable telnets
-term=TERM	set terminal type to TERM
-trace		turns on WWW trace mode
-version	print Lynx version information
-vikeys		enable vi-like key movement

USAGE: lynx -restrictions=[option][,option][,option]
	List of Options:
	all	restricts all options.
	default	same as commandline option -anonymous.  Disables default 
		services for anonymous users.  Currently everything is restricted
		except for: inside_telnet, outside_telnet, inside_news, 
		inside_ftp, outside_ftp, inside_rlogin, outside_rlogin, goto, 
		jump and mail.  Defaults settable within userdefs.h
	bookmark	disallow changing the location of the bookmark file.
	bookmark_exec	disallow execution links via the bookmark file
	disk_save	disallow saving binary files to disk in the download 
				menu
	download	disallow downloaders in the download menu
	editor		disallow editing
	exec		disable execution scripts
	exec_frozen	disallow the user from changing the execution link
	file_url	disallow using G)oto to go to file: URL's
	goto		disable the 'g' (goto) command
	inside_ftp	disallow ftps for people coming from inside your domain
	inside_news	disallow USENET news posting for people coming from 
				inside your domain
	inside_rlogin	disallow rlogins for people coming from inside your 
				domain
	inside_telnet	disallow telnets for people coming from inside your 
				domain
	jump		disable the 'j' (jump) command
	mail		disallow mail
	news_post	disallow USENET News posting setting in the 
				O)ptions menu
	option_save	disallow saving options in .lynxrc
	outside_ftp	disallow ftps for people coming from outside your domain
	outside_news	disallow USENET newsposting for those coming from 
				beyond your domain
	outside_rlogin	disallow rlogins for people coming from outside your 
				domain
	outside_telnet	disallow telnets for people coming from outside your 
				domain
	print		disallow most print options
	shell		disallow shell escapes, lynxexec, and lynxcgi G)oto's
	suspend		disallow Control-Z suspends with escape to shell
	telnet_port	disallow specifying a port in telnet G)oto's
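
As a sketch of the -restrictions usage line above, the helper below assembles
a lynx command line from a list of restriction names and rejects names that
are not in the list above. The function name lynx_command is hypothetical,
and the script only builds the argument list; it does not run lynx.

```python
# Restriction names copied from the list above.
KNOWN_RESTRICTIONS = {
    "all", "default", "bookmark", "bookmark_exec", "disk_save", "download",
    "editor", "exec", "exec_frozen", "file_url", "goto", "inside_ftp",
    "inside_news", "inside_rlogin", "inside_telnet", "jump", "mail",
    "news_post", "option_save", "outside_ftp", "outside_news",
    "outside_rlogin", "outside_telnet", "print", "shell", "suspend",
    "telnet_port",
}


def lynx_command(url, restrictions=()):
    """Return the argument list for: lynx -restrictions=a,b,c URL"""
    unknown = set(restrictions) - KNOWN_RESTRICTIONS
    if unknown:
        raise ValueError("unknown restriction(s): " + ", ".join(sorted(unknown)))
    argv = ["lynx"]
    if restrictions:
        argv.append("-restrictions=" + ",".join(restrictions))
    argv.append(url)
    return argv


command = lynx_command("http://www.yahoo.com", ["goto", "mail", "shell"])
print(" ".join(command))
```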

$ lynx -version
Lynx Version 2-4-2
(c)1995 University of Kansas
<lynx-help@ukanaix.cc.ukans.edu>

8. LYNX Cursor Commands
MOVEMENT:    
	Down arrow	- Highlight next topic                           
	Up arrow		- Highlight previous topic   
	Left arrow	- Return to previous topic                    
	Right arrow	- Jump to highlighted topic                    
	Return, Enter	- Launch Selection

SCROLLING:   
	+ (or space)  	 - Scroll down to next page                     
	- (or b)     	 - Scroll up to previous page                   
                                                                             
OTHER:
	? (or H)	- Help (this screen)                           
	a	- Add the current link to your bookmark file   
	c	- Send a comment to the document owner         
	d	- Download the current link                   
	e	- Edit the current file                      
	g	- Goto a user specified URL or file.          
	i	- Show an index of documents                 
	m	- Return to main screen                    
	o	- Set your options                            
	p	- Print to a file, mail, printers, or other entity
	q	- Quit  (Asks if you're sure?) 
	Q	- Quit (quick quit)   
	^D	- Quit Lynx (quick quit)
	/	- Search for a string within current document (be at top of 
			document)
	s	- Enter a search string for an external search.
	n	- Go to the next search string                 
	v	- View your bookmark file                    
	z	- Cancel transfer in progress                 
   [backspace]	- Go to the history page                      
	=	- Show file and link info                      
	\	- Toggle document (HTML) source/rendered view         
	!	- Spawn your default shell                     
	^R	- Reload current file and refresh the screen   
	^W	- Refresh the screen                           
	^U	- Erase input line                             
	^G	- Cancel input or transfer                             

Questions? Robert Katz: katz@ned.highline.edu
Last Update December 7, 1999