VIII. World Wide Web

1. World Wide Web
   Definition: an ambitious client-server system that offers a simple,
   consistent interface to the vast resources of the Internet.
   * It is the name given to a large collection of Internet-accessible text,
     pictures and data that can contain one or more links (highlighted and
     colored text or icons) to other data.
   * Text, pictures, and data containing such links are called hypertext.
   * The World Wide Web represents a network of links.
   * Originators:
       Vannevar Bush [45 Memex information machine]
       Ted Nelson [81 Xanadu]
       Apple [87 HyperCard]
       Tim Berners-Lee, Robert Cailliau [89- CERN project]
       WWW announced on Usenet 5/91
       [92 Lynx]
       Marc Andreessen [93 X-Mosaic], and then PC Air Mosaic and MacMosaic
   * A client program (browser) runs on your computer and contacts a remote
     server program.
   * Browsers now act not only as Web clients, but also as gopher, ftp,
     Usenet, and mail clients.

2. The Web
   Definition: an enormous Internet-based information utility encompassing
   not only Web servers offering hypertext data, but also all the gopher
   servers on the net, all the anonymous ftp sites, all the Usenet
   discussion (news)groups, and mail.

3. Note: The most important aspect of the Web is not the links that connect
   one data item to another; it is the content between the links.

4.
Information Dissemination in Communication Systems:

   Disadvantage: Limited large group participation (response)
   * [Snail/Express] Mail  [Discrete, one-one, different time different place]
     + Fax machines        [Fast Discrete, one-one, different time different place]
   * Telephone             [Discrete, one-one, same time different place]
     + Answering machines  [Discrete, one-one, different time different place]
   * Radio                 [Continuous update, small group-many, same time different place]
   * Television            [Continuous update, small group-many, same time different place]
   * The Newspaper         [Daily or less Update, small group-many, different time and place]
   * Magazines             [Weekly or less Update, small group-many, different time and place]
   * Books                 [Yearly or less Update, one-many, different time and place]

   Internet Benefits:
   * Potentially universal participation (interaction)
   * Nearly instantaneous support of one-one, one-many and many-many connections
   * Satisfies humanity's need (urge) to communicate
   * E-mail                [Discrete, one-many, different time and place]
   * E-mail Mailing Lists  [Updated, many-many, different time and place, - less accessible]
   * Usenet                [Updated, many-many, different time and place, - limited message time]
   * Gopher                [Updated, many-many, different time and place, menu based, + orgs]
   * The Web               [Updated, many-many, different time and place, + way people think]
   * The Web is the 4th effort to connect mankind on the Internet.
     + It is an information system with words, ideas, and pictures, with data type expansion
     + It has links to let you jump from one item to another
     + It has a user interface that is familiar and easy to use
     + It has client programs for PCs and Macintoshes, used by the majority of people

5. Using the Web:

   *1. Procedure:
       * Invoke a network-connected browser (e.g. Mosaic, Netscape, Explorer, Lynx)
       * In the File menu, open a location (URL) to view. Type its address, e.g.
         http://www.yahoo.com
       * The browser fetches the information to your system and displays it on your screen
       * Items that are highlighted via underline (lynx) or in color are links to elsewhere
       * Use the arrow keys to move and <CR> to invoke a link (lynx), or use
         the mouse to move and click to invoke a link

       Subjectively, you are "going on tangents", "surfing", "visiting" or
       "navigating" the web. Behaviorally, you are always either
         (1) Reading text [20% est.]
         (2) Looking at pictures or animation [4% est.]
         (3) Issuing commands [1% est.] or
         (4) WAITING [75% est.]

   *2. Notes:
       * The Web is a mixture of two kinds of sites:
         (1) created by individuals about themselves
         (2) containing information about one or more organizations.
       * When you move your mouse onto a highlighted link, your browser shows
         the URL associated with the link
       * Web Page:  The contents of a single file containing hypertext links to other pages
         Web Site:  A host containing one or more pages (related or not)
         Home Page: The main [menu] page for a Web Site; also the starting page
                    that can be loaded when your browser is launched
       * Most browsers let you see, copy and save the hypertext source for the viewed web page.
       * Use a word processor to convert existing documents into HTML
       * Use an HTML editor with an integrated browser to create new Web files.
       * Create a Web page from your Bookmark List so you can describe the entries more fully.
       * On UNIX systems, HTML files and their subdirectories should have
         permissions set to 704 and 705 respectively:
           704  public read only
           705  public directory read and access only

   *3.
Uniform Resource Locators (URLs)
       * Unique addressing for hypertext and other network items or services
         (gophers, anonymous ftp sites, Usenet newsgroups, wais databases)
       * Format: protocol://hostname[:port]/path
             or  protocol:description   (used by news and mailto)

         Protocol    Meaning
         afs         File accessed via the Andrew File System
         cid         Content identifier for a MIME body part message
         file        Access to a [local] file
         ftp         File accessed via [anonymous] ftp
         gopher      Gopher resource
         http        Hypertext Transfer Protocol resource
         mailserver  Data accessed via a mail server
         mailto      Mail a message to a specified address
         mid         Identifier of a specific mail message
         news        Usenet Newsgroup
         nfs         File accessed via the Network File System
         nntp        Usenet news for local NNTP access only
         prospero    Resource accessed via a Prospero directory server
         rlogin      Interactive rlogin session
         telnet      Interactive telnet session
         tn3270      Interactive 3270 telnet session
         wais        Access to a Wais database
         z39.50      Access to a database via a Z39.50-type query

   *4. Invoking a URL
       (1) Autoload it in advance when the browser is launched
       (2) Pull down a menu item to "open" a URL and type it in the dialog box.
       (3) Type the URL directly in the browser's box near the top of the window
       (4) Click on the built-in browser buttons assigned to specific resources to go there.
       (5) Choose an item from your bookmark list and/or history list

   *5. To stop a web page from loading
       (1) Press the STOP button on the browser
       (2) Press the animated icon (upper right) on the browser to bring back the home page
       (3) Press the escape key <ESC> on your keyboard (lynx)
       (4) Select another URL and invoke it while the current one is loading

   *6. Types of Links
       * Highlighted words (representing URLs)
       * Small graphical element like a button or picture (representing URLs)
       * Interactive Form: you can enter information that is submitted to a
         remote web site to be processed by an existing CGI script
         (CGI = Common Gateway Interface) on the remote server.
       * Image Map: A picture (photo or drawing) menu where the various parts
         of the image act as separate links.
           Discrete:   A collage of several small pictures
           Continuous: Clicked mouse pointer coordinates determine what is downloaded next

   *7. History List and Bookmarks
       * The History List is maintained for the current browser session,
         showing where you have been in reverse order. There are also buttons
         on the browser that help you go "Back" or "Forward" (left and right
         arrows are possible). Successive "Back" clicks take you in reverse
         towards your original home page.
       * The History List can be shown and an entry directly selected by mouse click.
       * The Bookmark List (Hot List) contains the Web addresses you have saved
         across all your sessions with this browser.
         - Your browser can save this Bookmark List to a file
         - Your browser can let you edit the Bookmark List to organize it into categories

   *8. Inline and External Images
       * An Inline Image is a picture that is part of a web page. It is usually
         a link to a file containing the full (external) image.
       * An External Image is a picture that is itself a web page, shown in its own window
       * Images are stored in formats like GIF and JPEG. To display these, your
         browser must invoke a helper application (plug-in) that acts as a GIF
         or JPEG viewer. (Your browser can be made aware of these in its
         configuration or preferences area.)

   *9. Sounds and Video
       * Sound is stored in formats like AU (audio) and WAV (Waveform)
       * Video is stored in formats like MPEG (Moving Picture Experts Group),
         QuickTime and AVI (Audio Video Interleave)
       * Regular sound files are downloaded, stored and then played.
       * Real-time sounds are played as they are downloaded but not retained.

   *10.
Updating Information Automatically
       * Client Pull: Your browser is told by the Web page to reload itself
         every n seconds. This refreshes the information and requires
         repeated browser connections.
       * Server Push: The Web server sends new data on its own to your browser.
         This refreshes the information, and the browser-server connection is
         kept open indefinitely.
       * Requirements: A Web page must specifically request client pull, a
         server must be designed to offer server push, and your browser must
         know how to support these features.
       * Examples: Web-based talk facilities, financial services stock quotes

   *11. Web Directories and Search Engines
       * Web pages containing a list of other Web Sites by category
       * Web pages containing a list of Search Engines invoked by filling in a form
       * Search Engines supply a list of links based on your keyword(s), with
         the most closely matched links first. Search engines/tools have two
         components: collection and search. The collection part (also known
         as automated robot, Wanderer, Spider, Harvest and Pursuit) roams
         Internet sites, mostly www, gopher and ftp sites, brings back
         resources, then sorts and indexes them to create a database.
       * Boolean Searches permit AND and OR operators on search item keywords
       * Examples: Web Directories
           Yahoo!
             http://www.yahoo.com
           Magellan               http://www.mckinley.com
           Point                  http://www.pointcom.com
           W3 Servers             http://www.w3.org/hypertext/DataSources/WWW/Servers.html
           Metroscope             http://isotropic.com/metro/scope.html
           Yanoff                 http://www.uwm.edu/Mirror/inet.services.html
       * Search Engines:
           AltaVista              http://www.altavista.digital.com
           Lycos                  http://www.lycos.com
           WebCrawler             http://www.webcrawler.com
           Inktomi                http://inktomi.cs.berkeley.edu:1234/
           DejaNews               http://www.dejanews.com/forms/dnq.html
           World Wide Web Worm    http://wwww.cs.colorado.edu/wwww/
           excite NetSearch       http://www.excite.com/
       * Searching Search Engines (Multiple Simultaneous Queries):
           SavvySearch            http://cage.cs.colostate.edu:1969/
           IBM InfoMarket Search  http://www.infomkt.ibm.com/
           MetaCrawler            http://metacrawler.cs.washington.edu:8080/index.html
           ProFusion              http://www.designlab.ukans.edu/ProFusion.html

   *12. Customizing your Browser environment by modifying:
       * Text size and typeface, and background and foreground colors
       * Link display characteristics (color, underlining)
       * Home (Startup) Page: turn off, turn on, or specify a URL to load
       * Window size and shape
       * Window manipulation characteristics: multiple windows, iconized windows
       * Image Loading: turn automatic loading off or on
       * Viewers: indicate new programs to read specially formatted data
       * Toolbar(s): show all, show none, or customize your own
       * Bookmark List: add, delete, and edit links; save to a file; organize by category
       * The Cache: choose in memory (single session) or disk stored
         (multisession), and when to flush the cache.
         [Flush often, via the browser or manually, to save disk space]

   *13.
Hints for using URLs
       * Don't change upper and lower case characters within URLs
         (hostnames are case insensitive, but pathnames are case sensitive)
       * Transcribe URLs carefully: use copy and paste whenever possible
       * Recognize common host name patterns: www for Web servers, ftp for
         anonymous ftp servers and gopher for gopher servers
       * You can guess the URL by typing the host name only and hoping for a home page
       * If the URL doesn't work, shorten it from the right, one slash at a
         time, until it does.
       * On some browsers, http:// is assumed, so you can just type: www.host.com
       * Type the entire URL when sending a URL by mail
       * Files whose names end with .html or .htm are hypertext files

   *14. Java: A Language For Distributed Applications
       * Java: an object-oriented programming language ("C++ without the pointers")
       * Java applets are computer-system independent
       * The idea: A Web page contains a very small program (applet) that your
         browser downloads and executes dynamically while the web page is displayed
       * Examples include animation, rotatable graphics, games, calculation,
         and other special effects
       * History:
         - Developed by James Gosling (Sun) in 1992 for consumer computer products (PDAs)
         - HotJava Browser developed in 1993
         - Official announcement of Java technology, May 1995
         - Adopted by Netscape (2.0) and Microsoft Explorer (to be)
       * Features:
         - Highly interactive
         - Uses the speed of your local computer, not the network or modem
         - Both an interpreted language (at runtime in the client browser
           "Viewer") and a compiled language (on the originating system)
         - Architecture-neutral byte code (computer-system independent)
         - Portable
         - Multithreaded
         - Extensible
         - Dynamic data types (links a specialized Java application to a
           server which supports the new data type)
         - Dynamic protocols (seamless integration of new protocols, such as
           electronic commerce, with existing ones)
         - Security-oriented:
           (1) No inherent semantics for altering a computer environment;
           (2) restricted access to system-level services;
           (3)
               UNIX directories accessed only: /tmp/hotjava and ~/.hotjava,
               via environment variables;
           (4) discerns applets inside and outside firewalls;
           (5) discerns socket and file manipulation by applet;
           (6) digital signature assignment to classes, with verification
               available before loading.

6. Lynx: A text-based Web browser
   * No complex data types means faster downloading
   * Operating Lynx remotely in a UNIX shell means faster downloading
   * Lynx permits you to concentrate on the information
   * Lynx can open frequently referenced files that were downloaded and saved locally

7. Lynx Command Syntax (a .lynxrc file may be used to customize lynx)

   $ lynx [options] [<URL> or pathname ]    # lynx file acts like a UNIX browser

   Options are:
     -anonymous        used to specify the anonymous account
     -auth=id:pw       authentication information for protected forms
     -case             enable case sensitive user searching
     -cache=NUMBER     NUMBER of documents cached in memory. (default is 10)
     -cfg=FILENAME     specifies a lynx.cfg file other than the default.
     -display=DISPLAY  set the display variable for X "exec"ed programs
     -dump             dump the first file to stdout and exit
     -editor=EDITOR    enable edit mode with specified editor
     -emacskeys        enable emacs-like key movement
     -error_file=file  write the HTTP status code here
     -fileversions     include all versions of files in local VMS directory listings
     -force_html       forces the first document to be interpreted as HTML
     -ftp              disable ftp access
     -get_data         User data for get forms, read from stdin, terminated by '---' on a line
     -help             print this usage message
     -homepage=URL     set homepage separate from start page
     -index=URL        set the default index file to URL
     -localhost        disable URLs that point to remote hosts
     -mime_header      Show full mime header
     -nobrowse         disable directory browsing
     -noprint          disable print functions
     -noredir          Don't follow Location: redirection
     -nostatus         disable the miscellaneous information messages
     -post_data        User data for post forms, read from stdin, terminated by '---' on a line
     -print            enable print functions
                       (DEFAULT)
     -restrictions[=options]  use -restrictions to see list below
     -rlogin           disable rlogins
     -selective        require .www_browsable files to browse directories
     -show_cursor      don't hide the cursor in the lower right corner
     -source           dump the source of the first file to stdout and exit
     -telnet           disable telnets
     -term=TERM        set terminal type to TERM
     -trace            turns on WWW trace mode
     -version          print Lynx version information
     -vikeys           enable vi-like key movement

   USAGE: lynx -restrictions=[option][,option][,option]

   List of Options:
     all             restricts all options.
     default         same as commandline option -anonymous. Disables default
                     services for anonymous users. Currently set to all
                     restricted except for: inside_telnet, outside_telnet,
                     inside_news, inside_ftp, outside_ftp, inside_rlogin,
                     outside_rlogin, goto, jump and mail. Defaults settable
                     within userdefs.h
     bookmark        disallow changing the location of the bookmark file.
     bookmark_exec   disallow execution links via the bookmark file
     disk_save       disallow saving binary files to disk in the download menu
     download        disallow downloaders in the download menu
     editor          disallow editing
     exec            disable execution scripts
     exec_frozen     disallow the user from changing the execution link
     file_url        disallow using G)oto to go to file: URL's
     goto            disable the 'g' (goto) command
     inside_ftp      disallow ftps for people coming from inside your domain
     inside_news     disallow USENET news posting for people coming from inside your domain
     inside_rlogin   disallow rlogins for people coming from inside your domain
     inside_telnet   disallow telnets for people coming from inside your domain
     jump            disable the 'j' (jump) command
     mail            disallow mail
     news_post       disallow USENET News posting setting in the O)ptions menu
     option_save     disallow saving options in .lynxrc
     outside_ftp     disallow ftps for people coming from outside your domain
     outside_news    disallow USENET news posting for those coming from beyond your domain
     outside_rlogin  disallow rlogins for people coming from outside your domain
     outside_telnet
                     disallow telnets for people coming from outside your domain
     print           disallow most print options
     shell           disallow shell escapes, lynxexec, and lynxcgi G)oto's
     suspend         disallow Control-Z suspends with escape to shell
     telnet_port     disallow specifying a port in telnet G)oto's

   $ lynx -version
   Lynx Version 2-4-2
   (c)1995 University of Kansas <lynx-help@ukanaix.cc.ukans.edu>

8. LYNX Cursor Commands

   MOVEMENT:
     Down arrow     - Highlight next topic
     Up arrow       - Highlight previous topic
     Left arrow     - Return to previous topic
     Right arrow    - Jump to highlighted topic
     Return, Enter  - Launch selection

   SCROLLING:
     + (or space)   - Scroll down to next page
     - (or b)       - Scroll up to previous page

   OTHER:
     ? (or H)       - Help (this screen)
     a              - Add the current link to your bookmark file
     c              - Send a comment to the document owner
     d              - Download the current link
     e              - Edit the current file
     g              - Goto a user specified URL or file.
     i              - Show an index of documents
     m              - Return to main screen
     o              - Set your options
     p              - Print to a file, mail, printers, or other entity
     q              - Quit (asks if you're sure)
     Q              - Quit (quick quit)
     ^D             - Quit Lynx (quick quit)
     /              - Search for a string within current document (be at top of document)
     s              - Enter a search string for an external search.
     n              - Go to the next search string
     v              - View your bookmark file
     z              - Cancel transfer in progress
     [backspace]    - Go to the history page
     =              - Show file and link info
     \              - Toggle document (HTML) source/rendered view
     !              - Spawn your default shell
     ^R             - Reload current file and refresh the screen
     ^W             - Refresh the screen
     ^U             - Erase input line
     ^G             - Cancel input or transfer

Questions? Robert Katz: katz@ned.highline.edu
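
Worked example: URL format and the "shorten from the right" hint

The URL format given under Uniform Resource Locators
(protocol://hostname[:port]/path, or protocol:description) and the hint to
shorten a failing URL from the right, one slash at a time, can be sketched
in a few lines of Python. This is only an illustration using the standard
urllib.parse module; the function names describe and shorten_candidates are
hypothetical and do not come from any browser.

```python
from urllib.parse import urlsplit, urlunsplit


def describe(url):
    """Split a URL into the parts named in the outline's format line.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # e.g. http, ftp, gopher, mailto
        "hostname": parts.hostname,  # lowercased; None for protocol:description URLs
        "port": parts.port,          # None when no [:port] is given
        "path": parts.path,          # case is preserved
    }


def shorten_candidates(url):
    """List the URLs obtained by trimming the path from the right,
    one slash at a time, ending at the host's top-level page.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if not parts.netloc:
        # protocol:description URLs (news, mailto) have no path to shorten
        return []
    path = parts.path
    candidates = []
    while path not in ("", "/"):
        path = path.rstrip("/")              # drop a trailing slash, if any
        path = path[: path.rfind("/") + 1]   # cut back to the previous slash
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


if __name__ == "__main__":
    print(describe("http://inktomi.cs.berkeley.edu:1234/"))
    for candidate in shorten_candidates(
        "http://www.w3.org/hypertext/DataSources/WWW/Servers.html"
    ):
        print(candidate)
```

Note that the hostname comes back in lower case while the path keeps its
original case, which matches the hint that hostnames are case insensitive
but pathnames are case sensitive.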