Distributed Web-Based Systems
Department of Engineering – Information Technology
Reza Ghanbari
2010

Outline
  • WWW
    • URL
    • Web Documents
  • HTTP
    • Connections
    • Methods
    • Messages
    • Caching
  • Content Distribution Network
  • Web Service
    • Terminology
    • Architecture
  • Traditional Web Based Systems
  • Multi-tiered Web Based Systems
  • Web Server Clusters
  • Web Security
    • SSL
  • References

World Wide Web
The World Wide Web is a large distributed system with millions of clients and servers for accessing linked documents.
Servers maintain collections of documents, while clients provide users with an easy-to-use interface for presenting and accessing those documents.
A document is fetched from a server, transferred to a client, and presented on the screen.
To the user, there is conceptually no difference between a document stored locally and one stored in another part of the world.
The Web has now become more than a simple document-based system.
With the emergence of Web Services, it is becoming a system of distributed services, rather than just documents, offered to any user or machine.

Uniform Resource Locator
A reference called a Uniform Resource Locator (URL) is used to refer to a document.
A URL specifies the DNS name of the associated server along with a file name.
Example:
http://www.example.sharif.edu/notes/WebBasedDistributedSystem.ppt

WEB DOCUMENTS
A Web document does not contain only text; it can include all kinds of dynamic features such as audio, video, and animations.
In many cases special helper applications (interpreters) are needed, and they are integrated into the browser.
The main part of a Web document is written in a markup language, such as HyperText Markup Language (HTML) or eXtensible Markup Language (XML).

WEB DOCUMENTS
HTML and XML can include tags that refer to embedded documents, that is, references to other files.
An embedded document can be a complete program executed on-the-fly as part of displaying information.
Multipurpose Internet Mail Extensions (MIME) is used to specify the type of an embedded document.
MIME was originally developed to provide information on the content of e-mail messages.

WEB DOCUMENTS
Table: Six top-level MIME types and some common subtypes.
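
As a small illustration of the type/subtype idea, the sketch below uses Python's standard mimetypes module to guess a MIME type from a file name and split it into its two parts. The file names are made up for the example, and the fallback to application/octet-stream is an assumption of this sketch; real servers and browsers may apply other rules.

```python
import mimetypes


def split_mime(filename: str) -> tuple[str, str]:
    """Guess the MIME type of a file name and split it into (type, subtype)."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        # Unknown extension: fall back to a generic binary type (assumption).
        mime = "application/octet-stream"
    top_level, subtype = mime.split("/", 1)
    return top_level, subtype


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    for name in ("index.html", "photo.jpeg", "notes.unknownext"):
        print(name, split_mime(name))
```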

HTTP
All communication between clients and servers is based on HTTP. Servers listen on port 80.
HTTP is a simple protocol; a client sends a request to a server and waits for a response.
HTTP is based on TCP; whenever a client issues a request to a server, it first sets up a TCP connection and sends the message on that connection. The same connection is used for receiving the response.
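
The request/response exchange over a single TCP connection can be sketched as follows. This is a toy illustration, not a real Web server: the server thread, the loopback address, and the OS-chosen port (instead of port 80) are assumptions made so that the example runs without network access.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Toy server: accept one connection, read the request, send one response."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # the demo request is small enough for a single read
        body = b"hello"
        head = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)


def fetch(host: str, port: int, path: str) -> bytes:
    """Set up a TCP connection, send a GET, and read the reply on the same connection."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection: response complete
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run the toy server on a free loopback port and fetch '/' from it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()
    try:
        return fetch("127.0.0.1", port, "/")
    finally:
        worker.join(timeout=5)
        listener.close()
```

Calling demo() returns the raw response bytes, status line first, which makes it easy to see that the reply travels back on the connection the client opened.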
One of the problems with the first versions of HTTP was its inefficient use of TCP connections.

HTTP 1.0 vs. HTTP 1.1

HTTP CONNECTIONS
A Web document is often constructed from a collection of different files from the same server.
In HTTP version 1.0 and older, each request to a server required setting up a separate connection. Once the server had responded, the connection was torn down. These connections are referred to as non-persistent.
In HTTP version 1.1, several requests and their responses can be exchanged over the same connection, without setting up a separate connection for each. These connections are referred to as persistent.
Furthermore, a client can issue several requests in a row without waiting for the response to the first request, which is referred to as pipelining.

HTTP CONNECTIONS
Figure: (a) Using non-persistent connections. (b) Using persistent connections.
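
To get a rough feel for the difference, the sketch below counts round-trip times (RTTs) under a deliberately simplified model. The model is an assumption of this example, not something measured: TCP setup costs one RTT, each request/response exchange costs one RTT, transfer time is ignored, and pipelined requests are all sent back to back.

```python
def round_trips(num_objects: int, mode: str) -> int:
    """Estimate the RTTs needed to fetch num_objects files from one server."""
    if num_objects < 0:
        raise ValueError("num_objects must be non-negative")
    if num_objects == 0:
        return 0
    if mode == "non-persistent":
        # A new connection (1 RTT) plus one exchange (1 RTT) per object.
        return 2 * num_objects
    if mode == "persistent":
        # One connection setup, then one exchange per object.
        return 1 + num_objects
    if mode == "pipelined":
        # One connection setup, then all requests issued without waiting.
        return 1 + 1
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("non-persistent", "persistent", "pipelined"):
        print(mode, round_trips(10, mode))
```

Under these assumptions, a page made of 10 files needs 20 RTTs with non-persistent connections, 11 with a persistent connection, and 2 with ideal pipelining.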
HTTP Operations
HTTP MESSAGES (Request)

HTTP MESSAGES (Response)
Status code (phrase): 200 (OK), 400 (Bad Request), 403 (Forbidden), and 404 (Not Found).

HTTP MESSAGES (Response)
There are also various message headers that the client can send to the server, explaining what it is able to accept as a response.
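
The status line of a response can be taken apart with a few lines of code. The sketch below is only an illustration: parse_status_line is a hypothetical helper written for this example, while HTTPStatus comes from Python's standard http module and supplies the standard phrase for a code.

```python
from http import HTTPStatus


def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a status line into (version, status code, phrase)."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase


def standard_phrase(code: int) -> str:
    """Look up the standard phrase for a status code."""
    return HTTPStatus(code).phrase


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
    for code in (200, 400, 403, 404):
        print(code, standard_phrase(code))
```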

HTTP Caching
  • Clients often cache documents
  • Challenge: update of documents
    • If-Modified-Since requests to check
      • HTTP 1.0 used just the date
      • HTTP 1.1 has an opaque “entity tag” (could be a file signature, etc.) as well
  • When/how often should the original be checked for changes?
    • Check every time?
    • Check each session? Day? Etc?
    • Use “Expires” header
    • If no Expires, often use Last-Modified as estimate
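
One way to turn the last two points into code is sketched below. It is an illustration under stated assumptions: when Expires is present, the copy is treated as fresh until that time; otherwise a fraction of the document's age at fetch time is used. The divisor of 10 (that is, 10% of the age) is a commonly cited heuristic value chosen for this example, not something the slides specify.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime
from typing import Optional

HEURISTIC_DIVISOR = 10  # assumption: treat 10% of the document's age as fresh


def freshness_lifetime(date: str, last_modified: str,
                       expires: Optional[str] = None) -> timedelta:
    """Return how long a cached copy may be reused before re-checking the original."""
    received = parsedate_to_datetime(date)
    if expires is not None:
        # An explicit expiry time wins over any estimate.
        return max(parsedate_to_datetime(expires) - received, timedelta(0))
    # No Expires header: estimate from how long the document had been unchanged.
    age_at_fetch = received - parsedate_to_datetime(last_modified)
    return max(age_at_fetch / HEURISTIC_DIVISOR, timedelta(0))
```

For a document fetched 10 days after its last modification and carrying no Expires header, this estimate allows the cached copy to be reused for one day.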

Example Cache Check Request
GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
If-None-Match: "7a11f-10ed-3a75ae4a"
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.intel-iris.net
Connection: Keep-Alive

Example Cache Check Response
HTTP/1.1 304 Not Modified
Date: Tue, 27 Mar 2001 03:50:51 GMT
Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
Connection: Keep-Alive
Keep-Alive: timeout=15, max=100
ETag: "7a11f-10ed-3a75ae4a"
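
The server-side decision behind such an exchange can be sketched roughly as follows. The function name and parameters are hypothetical; the sketch assumes that an entity tag, when sent, takes precedence over the date, and it ignores weak entity tags for brevity.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(current_etag: str, last_modified: str,
                       if_none_match: Optional[str] = None,
                       if_modified_since: Optional[str] = None) -> int:
    """Return 304 if the client's cached copy is still valid, else 200."""
    if if_none_match is not None:
        # Compare entity tags first; the date is not consulted in this case.
        client_tags = [tag.strip() for tag in if_none_match.split(",")]
        matched = "*" in client_tags or current_etag in client_tags
        return 304 if matched else 200
    if if_modified_since is not None:
        changed = (parsedate_to_datetime(last_modified)
                   > parsedate_to_datetime(if_modified_since))
        return 200 if changed else 304
    return 200  # unconditional request: send the full document
```

With the header values from the example request above and an unchanged document, the function yields 304, matching the example response.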

Problems
  • Over 50% of all HTTP objects are uncacheable.
  • Not easily solvable
    • Dynamic data: stock prices, scores, web cams
    • CGI scripts: results based on passed parameters
    • SSL: encrypted data is not cacheable
      • Most web clients don’t handle mixed pages well: many generic objects transferred with SSL
    • Cookies: results may be based on passed data
    • Hit metering: owner wants to measure # of hits for revenue, etc.

Server Selection
  • Lowest load: to balance load on servers
  • Best performance: to improve client performance
  • Any alive node: to provide fault tolerance
How to direct clients to a specific server?
  • Cluster load balancing: TCP hand-off
  • As part of application: HTTP redirect
  • As part of naming: DNS

Application-Based Redirection
  • HTTP supports a simple way to indicate that a Web page has moved (30X responses)
  • Server receives a GET request from the client
    • Decides which server is best suited for the particular client and object
    • Returns an HTTP redirect to that server
  • May introduce additional overhead: multiple connection setups, name lookups, etc.
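
A minimal sketch of this redirection step is shown below. The host names and load figures are hypothetical, lowest reported load is only one possible selection policy, and 302 is used as one representative of the 30X family.

```python
def pick_server(loads: dict[str, float]) -> str:
    """Pick the replica with the lowest reported load (one possible policy)."""
    return min(loads, key=loads.get)


def redirect_response(path: str, loads: dict[str, float]) -> str:
    """Build an HTTP redirect that sends the client to the chosen replica."""
    target = pick_server(loads)
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: http://{target}{path}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # Hypothetical replicas and load values, for illustration only.
    replicas = {"a.example.org": 0.9, "b.example.org": 0.2}
    print(redirect_response("/notes/index.html", replicas))
```

The client then has to open a new connection to the host named in the Location header, which is where the extra overhead mentioned above comes from.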

Naming Based
  • Client does a name lookup for the service
  • Name server chooses an appropriate server address
    • The A record returned is the “best” one for the client
  • Name server could base its decision on
    • Server load/location (must be collected)
    • Information in the name lookup request
      • Name service client: typically the local name server for the client

Web Proxy Caches
Figure: clients send HTTP requests to a proxy server and receive HTTP responses from it; the proxy server exchanges HTTP requests and responses with origin servers.
  • User configures browser: Web accesses via cache
  • Browser sends all HTTP requests to cache
    • Object in cache: cache returns object
    • Else cache requests object from origin server, then returns object to client
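
The cache logic in the last two points can be sketched as a small class. This is an illustrative in-memory version only: ProxyCache and its fetch_from_origin callback are hypothetical names, and the sketch never expires or revalidates entries.

```python
from typing import Callable


class ProxyCache:
    """In-memory proxy cache keyed by URL."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Object in cache: return it without contacting the origin server.
            return self._store[url]
        # Otherwise request the object from the origin server, keep a copy,
        # and return it to the client.
        obj = self._fetch(url)
        self._store[url] = obj
        return obj
```

A second request for the same URL is answered from the stored copy, so the origin server sees only one request.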

Content Distribution Networks (CDNs)
  • The content providers are the CDN customers.
  • Content replication
    • CDN company installs hundreds of CDN servers throughout the Internet
      • Close to users
    • CDN replicates its customers’ content in CDN servers. When a provider updates content, the CDN updates its servers
Figure: origin server in North America; CDN distribution node; CDN servers in U.S.A., Asia, and Europe.

Content Distribution Networks
  • Replicate content on many servers
Figure: The general organization of a CDN as a feedback-control system.

Web Service
  • Web Service: “software that makes services available on a network using technologies such as XML and HTTP”
  • Service-Oriented Architecture (SOA): “development of applications from distributed collections of smaller loosely coupled service providers”

Web Services Terminology
  • SOAP: Simple Object Access Protocol
    • exchanging XML messages on a network
  • WSDL: Web Services Description Language
    • describing interfaces of Web services
  • UDDI: Universal Description, Discovery and Integration
    • managing registries of Web services
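
To make "exchanging XML messages" a little more concrete, the sketch below builds and reads a tiny SOAP 1.1 style envelope with Python's standard xml.etree.ElementTree. The GetQuote operation, the Symbol element, and the example.org service namespace are hypothetical; only the envelope namespace is the standard SOAP 1.1 one.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/stock"  # hypothetical service namespace


def build_request(symbol: str) -> bytes:
    """Wrap a hypothetical GetQuote call in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}GetQuote")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def read_symbol(message: bytes) -> str:
    """Extract the Symbol argument from a GetQuote request."""
    root = ET.fromstring(message)
    node = root.find(
        f"{{{SOAP_NS}}}Body/{{{SERVICE_NS}}}GetQuote/{{{SERVICE_NS}}}Symbol"
    )
    if node is None or node.text is None:
        raise ValueError("not a GetQuote request")
    return node.text
```

In a real deployment the names of the operation and its arguments would come from the service's WSDL description rather than being hard-coded as they are here.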

Web Services Framework

Why a New Framework?
  • CORBA, DCOM, Java/RMI, ... already exist
  • XML+HTTP: platform/language neutral, widely accepted and utilized
    • Web service interoperability

Servlets/CGI vs. Web Services
Figure (labels: Browser, GUI Client, Web Server, HTTP GET/POST, SOAP, WSDL, JDBC, DB).


TRADITIONAL WEB-BASED SYSTEMS
Many Web-based systems are still organized as simple client-server architectures.
The core of a Web site: a process that has access to a local file system storing documents.
A client interacts with Web servers through a special application known as a browser.
What’s the key function of a browser?
locates and returns the object identified in the request.
includes predefined HTML pages and JPEG or GIF files.
Web servers do not require communication with any server-side application.
The request is forwarded to an application system where the resulting reply is generated dynamically (server-side program execution).
Although the Web started as a simple two-tiered client-server architecture for static Web documents, this architecture has been extended to support more advanced types of documents.

MULTITIERED ARCHITECTURES
One of the first enhancements was the Common Gateway Interface (CGI): user data comes from an HTML form, specifying the program and parameters.
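
A CGI-style program can be sketched in a few lines. The sketch assumes the usual CGI convention that the Web server passes the parameters of a GET form in the QUERY_STRING environment variable and sends whatever the program writes to standard output back to the client; the name parameter and the greeting are made up for the example.

```python
from urllib.parse import parse_qs


def handle_request(environ: dict[str, str]) -> str:
    """Produce a CGI-style reply (headers, blank line, body) from the environment."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to a default if it is absent.
    name = params.get("name", ["world"])[0]
    body = f"Hello, {name}!\n"
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    import os
    import sys

    # When run by a Web server as a CGI program, the reply goes to standard output.
    sys.stdout.write(handle_request(dict(os.environ)))
```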

MULTITIERED ARCHITECTURES
Because of server-side processing, many Web sites are now organized as three-tiered architectures consisting of a Web server, an application server, and a database server.
Server-side scripting technologies are used to generate dynamic content:
  • Sun: Java Server Pages (JSP)
Most popular Web server software: Apache. As of March 2007, 58% of all websites were using it.

WEB SERVER CLUSTERS
Web servers are replicated and combined with a front end to improve performance.

WEB SERVER CLUSTERS
The front end can be designed in two ways:
  • In the first, the front end simply passes data sent along the TCP connection to one of the servers, depending on some measurement of the server’s load.
  • In the second, it first inspects the HTTP request and decides which server it should forward that request to.
For example, if the front end always forwards requests for the same document to the same server, the server may cache the document, resulting in better response times.
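
One simple way for a front end to send requests for the same document to the same server is to hash the requested path, as sketched below. Hashing is an assumption of this example (the slides do not say how the mapping is made), and the server names are hypothetical.

```python
import hashlib


def choose_server(path: str, servers: list[str]) -> str:
    """Map a request path to a server so that equal paths always go to the same one."""
    if not servers:
        raise ValueError("at least one server is required")
    # A stable hash is used so the mapping does not change between runs.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    backends = ["server-1", "server-2", "server-3"]  # hypothetical names
    for path in ("/index.html", "/logo.gif", "/index.html"):
        print(path, "->", choose_server(path, backends))
```

A drawback of this particular mapping is that adding or removing a server changes the target of most paths, so the servers' caches would have to warm up again.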

WEB SERVER CLUSTERS
Figure: A scalable content-aware cluster of Web servers.

WEB SERVER CLUSTERS
Another way to set up a Web server cluster is to use round-robin DNS:
  • A single domain name is associated with multiple IP addresses.
  • When resolving a host name, a browser receives a list of multiple addresses, each address corresponding to a server.
  • Normally, browsers choose the first address on the list, but most DNS servers rotate the entries.
  • As a result, a simple distribution of requests over the servers in the cluster is achieved.
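
The rotation can be simulated without any real DNS traffic. In the sketch below, RoundRobinZone is a hypothetical stand-in for a name server and the addresses are documentation-only example addresses; each lookup returns the full list and then rotates it by one position.

```python
from collections import deque


class RoundRobinZone:
    """Toy name server that rotates its address list after every lookup."""

    def __init__(self, addresses: list[str]) -> None:
        self._addresses = deque(addresses)

    def resolve(self) -> list[str]:
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # the next lookup starts with the next address
        return answer


if __name__ == "__main__":
    zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    for _ in range(4):
        # A browser that always takes the first entry ends up cycling over the servers.
        print(zone.resolve()[0])
```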

Web Security Issues
  • The Web has become the visible interface of the Internet
  • Many corporations now use the Web for advertising, marketing and sales
  • Web servers might be easy to use but
    • Complicated to configure correctly and difficult to build without security flaws
    • They can serve as a security hole by which an adversary might be able to access other data and computer systems

Secure the Web
There are many strategies for securing the Web:
  • We may attempt to secure the IP layer of the TCP/IP stack: this may be accomplished using IPSec, for example.
  • We may leave IP alone and secure on top of TCP: this may be accomplished using the Secure Sockets Layer (SSL) or Transport Layer Security (TLS).
  • We may seek to secure specific applications by using application-specific security solutions: for example, we may use Secure Electronic Transaction (SET).
The first two provide generic solutions, while the third provides more specialized services.
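
Securing the TCP/IP Stack
Figure, three protocol stacks listed from top to bottom:
  • At the Network Level: HTTP, FTP, SMTP / TCP / IP/IPSec
  • At the Transport Level: HTTP, FTP, SMTP / SSL/TLS / TCP / IP
  • At the Application Level: S/MIME, PGP, SET / Kerberos, SMTP, HTTP / TCP, UDP / IP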

Secure Sockets Layer (SSL)
  • Originally developed (1994) by Netscape in order to secure HTTP communications
  • A slight variation became Transport Layer Security (TLS)
    • backward compatible with SSL
  • TCP provides a reliable end-to-end service
  • Consists of two sublayers:
    • SSL Record Protocol (where all the action takes place)
    • SSL Management (Handshake/Cipher Change/Alert Protocols)
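
From an application's point of view, securing a connection on top of TCP can look roughly like the sketch below, which uses Python's standard ssl module. Despite its name, that module is used with TLS, the successor of SSL. The minimum version setting and the helper names are assumptions of this example, and open_secure_connection needs network access, so it is shown but not exercised here.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client-side context that verifies server certificates and host names."""
    context = ssl.create_default_context()
    # Assumption of this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_secure_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a TCP connection and run the handshake on top of it."""
    raw = socket.create_connection((host, port), timeout=10)
    # The handshake takes place inside wrap_socket; afterwards, data written
    # to the returned socket is protected before it is handed to TCP.
    return make_client_context().wrap_socket(raw, server_hostname=host)
```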

References
  • Distributed Systems: Principles and Paradigms, by Maarten van Steen, VU Amsterdam, steen@cs.vu.nl
  • Web Service Composition - Current Solutions and Open Problems, by Biplav Srivastava (IBM India Research Laboratory) and Jana Koehler (IBM Zurich Research Laboratory)
  • A Reference Architecture for Web Servers, by Ahmed E. Hassan and Richard C. Holt, Software Architecture Group (SWAG), University of Waterloo
  • An Introduction to Web-based Support Systems, by JingTao Yao, University of Regina
  • Semantic Annotation for Web Services and their Relevance to Environmental Models, by Dumitru Roman, University of Innsbruck / STI Innsbruck

Editor's Notes

  • #29 This is done by ‘wrapping’ some computational capability with a Web Service interface, and allowing other organizations to locate it (via UDDI) and interact with it (via WSDL). Hence, Web Service technology allows the description of an interface in a standard way,