Overview
Component-based programming has become more popular than ever. Hardly an application is built today that does not involve components in some form, usually from different vendors. As applications have grown more sophisticated, the need to leverage components distributed on remote machines has also grown.
An example of a component-based application is an end-to-end e-commerce solution. An e-commerce application residing on a Web farm needs to submit orders to a back-end Enterprise Resource Planning (ERP) application. In many cases, the ERP application resides on different hardware and might run on a different operating system.
Microsoft’s Distributed Component Object Model (DCOM), a distributed object infrastructure that allows an application to invoke Component Object Model (COM) components installed on another server, has been ported to a number of non-Windows platforms. But DCOM has never gained wide acceptance on these platforms, so it is rarely used to facilitate communication between Windows and non-Windows computers. ERP software vendors often create components for the Windows platform that communicate with the back-end system via a proprietary protocol.
Some services leveraged by an e-commerce application might not reside within the datacenter at all. For example, if the e-commerce application accepts credit card payment for goods purchased by the customer, it must use the services of the merchant bank to process the customer’s credit card information. But in practice, DCOM and related technologies such as CORBA and Java RMI are limited to applications and components installed within the corporate datacenter. Two primary reasons for this are that by default these technologies use proprietary protocols and that these protocols are inherently connection oriented.
Clients communicating with the server over the Internet face numerous potential barriers. Security-conscious network administrators around the world have implemented corporate routers and firewalls to disallow practically every type of communication over the Internet. It often takes an act of God to get a network administrator to open ports beyond the bare minimum.
If you are lucky enough to get a network administrator to open the appropriate ports to support your service, chances are your clients will not be as fortunate. As a result, proprietary protocols such as those used by DCOM, CORBA, and Java RMI are not practical for Internet scenarios.
The other problem with these technologies, as I said, is that they are inherently connection oriented and therefore cannot handle network interruptions gracefully. Because the Internet is not under your direct control, you cannot make any assumptions about the quality or reliability of the connection. If a network interruption occurs, the next call the client makes to the server might fail.
The connection-oriented nature of these technologies also makes it challenging to build the load-balanced infrastructures necessary to achieve high scalability. Once the connection between the client and the server is severed, you cannot simply route the next request to another server.
Developers have tried to overcome these limitations by leveraging a model called stateless programming, but they have had limited success because the technologies are fairly heavy and make it expensive to reestablish a connection with a remote object.
Because the processing of a customer’s credit card is accomplished by a remote server on the Internet, DCOM is not ideal for facilitating communication between the e-commerce client and the credit card processing server. As in an ERP solution, a third-party component is often installed within the client’s datacenter (in this case, by the credit card processing solution provider). This component serves as little more than a proxy that facilitates communication between the e-commerce software and the merchant bank via a proprietary protocol.
Do you see a pattern here? Because of the limitations of existing technologies in facilitating communication between computer systems, software vendors have often resorted to building their own infrastructure. This means resources that could have been used to add improved functionality to the ERP system or the credit card processing system have instead been devoted to writing proprietary network protocols.
In an effort to better support such Internet scenarios, Microsoft initially adopted the strategy of augmenting its existing technologies, including COM Internet Services (CIS), which allows you to establish a DCOM connection between the client and the remote component over port 80. For various reasons, CIS was not widely accepted.
It became clear that a new approach was needed. So Microsoft decided to address the problem from the bottom up. Let’s look at some of the requirements the solution had to meet in order to succeed.
- Interoperability: The remote service must be able to be consumed by clients on other platforms.
- Internet friendliness: The solution should work well for supporting clients that access the remote service from the Internet.
- Strongly typed interfaces: There should be no ambiguity about the type of data sent to and received from a remote service. Furthermore, datatypes defined by the remote service should map reasonably well to datatypes defined by most procedural programming languages.
- Ability to leverage existing Internet standards: The implementation of the remote service should leverage existing Internet standards as much as possible and avoid reinventing solutions to problems that have already been solved. A solution built on widely adopted Internet standards can leverage existing toolsets and products created for the technology.
- Support for any language: The solution should not be tightly coupled to a particular programming language. Java RMI, for example, is tightly coupled to the Java language. It would be difficult to invoke functionality on a remote Java object from Visual Basic or Perl. A client should be able to implement a new Web service or use an existing Web service regardless of the programming language in which the client was written.
- Support for any distributed component infrastructure: The solution should not be tightly coupled to a particular component infrastructure. In fact, you should not be required to purchase, install, or maintain a distributed object infrastructure just to build a new remote service or consume an existing service. The underlying protocols should facilitate a base level of communication between existing distributed object infrastructures such as DCOM and CORBA.
Given the title of this book, it should come as no surprise that the solution Microsoft created is known as Web services. A Web service exposes an interface to invoke a particular activity on behalf of the client. A client can access the Web service through the use of Internet standards.
Web Services Building Blocks
The following graphic shows the core building blocks needed to facilitate remote communication between two applications.
Let’s discuss the purpose of each of these building blocks. Because many readers are familiar with DCOM, I will also mention the DCOM equivalent of each building block.
- Discovery: The client application that needs access to functionality exposed by a Web service needs a way to resolve the location of the remote service. This is accomplished through a process generally termed discovery. Discovery can be facilitated via a centralized directory as well as by more ad hoc methods. In DCOM, the Service Control Manager (SCM) provides discovery services.
- Description: Once the end point for a particular Web service has been resolved, the client needs sufficient information to properly interact with it. The description of a Web service encompasses structured metadata about the interface that is intended to be consumed by a client application as well as written documentation about the Web service, including examples of use. A DCOM component exposes structured metadata about its interfaces via a type library (typelib). The metadata within a component’s typelib is stored in a proprietary binary format and is accessed via a proprietary application programming interface (API).
- Message format: In order to exchange data, a client and a server have to agree on a common way to encode and format the messages. A standard way of encoding data ensures that data encoded by the client will be properly interpreted by the server. In DCOM, messages sent between a client and a server are formatted as defined by the DCOM Object RPC (ORPC) protocol.
Without a standard way of formatting the messages, developing a toolset to abstract the developer from the underlying protocols is next to impossible. Creating an abstraction layer between the developer and the underlying protocols allows the developer to focus more on the business problem at hand and less on the infrastructure required to implement the solution.
- Encoding: The data transmitted between the client and the server needs to be encoded into the body of the message. DCOM uses a binary encoding scheme to serialize the data contained in the parameters exchanged between the client and the server.
- Transport: Once the message has been formatted and the data has been serialized into the body of the message, the message must be transferred between the client and the server over some transport protocol. DCOM supports a number of proprietary protocols bound to a number of network protocols such as TCP, SPX, NetBEUI, and NetBIOS over IPX.
Web Services Design Decisions
Let’s discuss some of the design decisions behind these building blocks for Web services.
Choosing Transport Protocols
The first step was to determine how the client and the server would communicate with each other. The client and the server can reside on the same LAN, but the client might potentially communicate with the server over the Internet. Therefore, the transport protocol must be equally suited to LAN environments and the Internet.
As I mentioned earlier, technologies such as DCOM, CORBA, and Java RMI are ill suited for supporting communication between the client and the server over the Internet. Protocols such as Hypertext Transfer Protocol (HTTP) and Simple Mail Transfer Protocol (SMTP) are proven Internet protocols. HTTP defines a request/response messaging pattern for submitting a request and getting an associated response. SMTP defines a routable messaging protocol for asynchronous communication. Let’s examine why HTTP and SMTP are well suited for the Internet.
HTTP-based Web applications are inherently stateless. They do not rely on a continuous connection between the client and the server. This makes HTTP an ideal protocol for high-availability configurations such as load-balanced server farms. If the server that handled the client’s original request becomes unavailable, subsequent requests can be automatically routed to another server without the client knowing or caring.
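The rerouting described above can be sketched in a few lines. This is a toy simulation rather than real HTTP infrastructure: the server names, the `make_server` and `route` helpers, and the shape of the request are all assumptions made for illustration. The point it tries to show is that when each request carries everything needed to process it, any available server can answer.

```python
def make_server(name, available=True):
    """Return a hypothetical request handler that keeps no per-client state."""
    def handle(request):
        if not available:
            raise ConnectionError(f"{name} is unavailable")
        # Everything needed to do the work arrives in the request itself.
        return f"{name} processed order {request['order_id']}"
    return handle


def route(servers, request):
    """Try each server in turn and return the first successful response."""
    for server in servers:
        try:
            return server(request)
        except ConnectionError:
            continue  # this server is down, so fall through to the next one
    raise ConnectionError("no server available")


# The first server is down; the request is answered by the second one.
servers = [make_server("web01", available=False), make_server("web02")]
response = route(servers, {"order_id": 1001})
print(response)
```

With a connection-oriented protocol, the same failover would first require reestablishing the connection and any state tied to it.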
Almost all companies have an infrastructure in place that supports SMTP. SMTP is well suited for asynchronous communication. If service is disrupted, the e-mail infrastructure automatically handles retries. Unlike with HTTP, you can pass SMTP messages to a local mail server that will attempt to deliver the mail message on your behalf.
The other significant advantage of both HTTP and SMTP is their pervasiveness. Employees have come to rely on both e-mail and their Web browsers, and network administrators have a high comfort level supporting these services. Technologies such as network address translation (NAT) and proxy servers provide a way to access the Internet via HTTP from within otherwise isolated corporate LANs. Administrators will often expose an SMTP server that resides inside the firewall. Messages posted to this server will then be routed to their final destination via the Internet.
In the case of credit card processing software, an immediate response is needed from the merchant bank to determine whether the order should be submitted to the ERP system. HTTP, with its request/response message pattern, is well suited to this task.
Most ERP software packages are not capable of handling large volumes of orders that can potentially be driven from the e-commerce application. In addition, it is not imperative that the orders be submitted to the ERP system in real time. Therefore, SMTP can be leveraged to queue orders so that they can be processed serially by the ERP system.
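As a rough sketch of this idea, the following Python fragment wraps an order in a mail message using the standard `email` package. The addresses, the subject line, the XML element names, and the `build_order_message` helper are hypothetical; the actual hand-off to a mail server is shown only as a comment because it depends on your environment.

```python
from email.message import EmailMessage


def build_order_message(order_xml, sender, recipient):
    """Wrap an XML order in a mail message that a mail server can queue."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "New order"
    # Carry the order as a text/xml body.
    message.set_content(order_xml, subtype="xml")
    return message


# Hypothetical order document and addresses.
order_xml = "<Order><OrderId>1001</OrderId><Quantity>3</Quantity></Order>"
message = build_order_message(
    order_xml, "shop@example.com", "erp-orders@example.com"
)

# Handing the message to a local mail server might look like this (not run here):
#   import smtplib
#   with smtplib.SMTP("localhost") as server:
#       server.send_message(message)
```

The e-commerce application returns as soon as the local mail server accepts the message; delivery and retries then happen without the application's involvement.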
If the ERP system supports distributed transactions, another option is to leverage Microsoft Message Queue Server (MSMQ). As long as the e-commerce application and the ERP system reside within the same LAN, connectivity via non-Internet protocols is less of an issue. The advantage MSMQ has over SMTP is that messages can be placed in and removed from the queue within the scope of a transaction. If an attempt to process a message that was pulled off the queue fails, the message will automatically be placed back in the queue when the transaction aborts.
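The transactional behavior can be illustrated with a small in-memory model. This is not the MSMQ API; the `TransactionalQueue` class and the failing `process` function are assumptions invented for this sketch, and a real queue would also persist messages and coordinate with a transaction manager.

```python
from collections import deque
from contextlib import contextmanager


class TransactionalQueue:
    """Toy queue: a received message returns to the queue if processing fails."""

    def __init__(self):
        self._items = deque()

    def put(self, message):
        self._items.append(message)

    @contextmanager
    def receive(self):
        message = self._items.popleft()
        try:
            yield message  # the caller processes the message here
        except Exception:
            # "Abort": put the message back at the front of the queue.
            self._items.appendleft(message)
            raise

    def __len__(self):
        return len(self._items)


def process(order):
    """Hypothetical processing step that always fails."""
    raise RuntimeError("ERP system unavailable")


orders = TransactionalQueue()
orders.put("order-1001")

try:
    with orders.receive() as order:
        process(order)
except RuntimeError:
    pass  # the failed order is still waiting in the queue
```

After the failed attempt, the order is back in the queue and can be retried later, which is the guarantee a plain SMTP hand-off does not give you.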
Choosing an Encoding Scheme
HTTP and SMTP provide a means of sending data between the client and the server. However, neither specifies how the data within the body of the message should be encoded. Microsoft needed a standard, platform-neutral way to encode data exchanged between the client and the server.
Because the goal was to leverage Internet-based protocols, Extensible Markup Language (XML) was the natural choice. XML offers many advantages, including cross-platform support, a common type system, and support for industry-standard character sets.
Binary encoding schemes such as those used by DCOM, CORBA, and Java RMI must address compatibility issues between different hardware platforms. For example, different hardware platforms have different internal binary representation of multi-byte numbers. Intel platforms order the bytes of a multi-byte number using the little endian convention; many RISC processors order the bytes of a multi-byte number using the big endian convention.
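The byte-order difference is easy to see with Python's standard `struct` module. The value `0x12345678` is an arbitrary number chosen for illustration.

```python
import struct

value = 0x12345678  # arbitrary 32-bit integer used for illustration

# "<I" packs an unsigned 32-bit integer in little-endian byte order;
# ">I" packs the same integer in big-endian byte order.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

print(little.hex())  # 78563412
print(big.hex())     # 12345678

# Reading little-endian bytes as if they were big-endian gives a different number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))  # 0x78563412
```

A binary protocol therefore has to state which byte order is on the wire, or both sides have to negotiate it.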
XML avoids binary encoding issues because it uses a text-based encoding scheme that leverages standard character sets. Also, some transport protocols, such as SMTP, can contain only text-based messages.
Binary methods of encoding, such as those used by DCOM and CORBA, are cumbersome and require a supporting infrastructure to abstract the developer from the details. XML is much lighter weight and easier to handle because it can be created and consumed using standard text-parsing techniques.
In addition, a variety of XML parsers are available to further simplify the creation and consumption of XML documents on practically every modern platform. XML is lightweight and has excellent tool support, so XML encoding allows incredible reach because practically any client on any platform can communicate with your Web service.
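As a minimal sketch of how little is needed to produce and consume XML, the following uses the `xml.etree.ElementTree` parser from the Python standard library. The `Order` and `Quantity` element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sender: build a small document. Element names are made up for this example.
order = ET.Element("Order")
ET.SubElement(order, "Quantity").text = str(305419896)

# The number travels as decimal characters, so byte order is not a concern.
wire_bytes = ET.tostring(order, encoding="utf-8")

# Receiver: parse the text and convert the value back to an integer.
received = ET.fromstring(wire_bytes)
quantity = int(received.findtext("Quantity"))
print(quantity)  # 305419896
```

Any platform with an XML parser, or even basic string handling, could read the same bytes.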
Choosing a Formatting Convention
It is often necessary to include additional metadata with the body of the message. For example, you might want to include information about the type of services that a Web service needs to provide in order to fulfill your request, such as enlisting in a transaction or routing information. XML provides no mechanism for differentiating the body of the message from its associated data.
Transport protocols such as HTTP provide an extensible mechanism for header data, but some data associated with the message might not be specific to the transport protocol. For example, the client might send a message that needs to be routed to multiple destinations, potentially over different transport protocols. If the routing information were placed into an HTTP header, it would have to be translated before being sent to the next intermediary over another transport protocol, such as SMTP. Because the routing information is specific to the message and not the transport protocol, it should be a part of the message.
Simple Object Access Protocol (SOAP) provides a protocol-agnostic means of associating header information with the body of the message. Every SOAP message must define an envelope. The envelope has a body that contains the payload of the message and a header that can contain metadata associated with the message.
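The envelope structure can be sketched as follows. The envelope namespace is the one used by SOAP 1.1; the routing header entry, the order payload, their `urn:example:` namespaces, and the `build_envelope` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ET.register_namespace("soap", SOAP_NS)


def build_envelope(header_entries, payload):
    """Wrap header entries and a payload element in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    for entry in header_entries:
        header.append(entry)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical routing metadata that belongs to the message, not the transport.
routing = ET.Element("{urn:example:routing}NextHop")
routing.text = "mailto:orders@example.com"

# Hypothetical payload.
payload = ET.Element("{urn:example:orders}SubmitOrder")
ET.SubElement(payload, "{urn:example:orders}OrderId").text = "1001"

envelope = build_envelope([routing], payload)
print(ET.tostring(envelope, encoding="unicode"))
```

Because the routing entry lives in the SOAP header rather than in an HTTP header, it survives unchanged if an intermediary forwards the message over a different transport.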
SOAP imposes no restrictions on how the message body can be formatted. This is a potential concern because without a consistent way of encoding the data, it is difficult to develop a toolset that abstracts you from the underlying protocols. You might have to spend a fair amount of time getting up to speed on the Web service’s interface instead of solving the business problem at hand.
What was needed was a standard way of formatting a remote procedure call (RPC) message and encoding its list of parameters. This is exactly what Section 7 of the SOAP specification provides. It describes a standard naming convention and encoding style for procedure-oriented messages.
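The general shape of such a procedure-oriented message is sketched below: the body holds one element named after the method, with one child element per parameter. The `AuthorizePayment` method, its parameters, the `urn:example:payments` namespace, and both helper functions are hypothetical, and the sketch omits details such as per-parameter type information.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
SOAP_ENC = "http://schemas.xmlsoap.org/soap/encoding/"


def build_rpc_request(service_ns, method, params):
    """Build an envelope whose body holds one element named after the method."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    call.set(f"{{{SOAP_NS}}}encodingStyle", SOAP_ENC)
    for name, value in params:
        # Each parameter becomes a child element, in call order.
        ET.SubElement(call, name).text = str(value)
    return envelope


def read_rpc_request(envelope):
    """Recover the method name and parameters on the receiving side."""
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    method = call.tag.split("}", 1)[1]  # strip the namespace part of the tag
    params = [(child.tag, child.text) for child in call]
    return method, params


# Hypothetical call to a merchant bank's payment service.
request = build_rpc_request(
    "urn:example:payments",
    "AuthorizePayment",
    [("cardNumber", "4111111111111111"), ("amount", "49.95")],
)
method, params = read_rpc_request(ET.fromstring(ET.tostring(request)))
print(method, params)
```

Because both sides follow the same convention, a toolset can generate this plumbing and expose the call to the developer as an ordinary method.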
Because SOAP provides a standard format for serializing data into an XML message, platforms such as ASP.NET and Remoting can abstract away the details for you.
Choosing Description Mechanisms
SOAP provides a standard way of formatting messages exchanged between the Web service and the client. However, the client needs additional information in order to properly serialize the request and interpret the response. XML Schema provides a means of creating schemas that can be used to describe the contents of a message.
XML Schema provides a core set of built-in datatypes that can be used to describe the contents of a message. You can also create your own datatypes. For example, the merchant bank can create a complex datatype to describe the content and structure of the body of a message used to submit a credit card payment request.
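To make this concrete, here is a sketch of what such a complex datatype might look like. The schema is embedded in a short Python script that lists the fields it declares. The `CreditCardPayment` type, its fields, and the target namespace are invented for this example and are not taken from any real merchant bank.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for the body of a credit card payment request.
PAYMENT_SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:payments"
           elementFormDefault="qualified">
  <xs:complexType name="CreditCardPayment">
    <xs:sequence>
      <xs:element name="CardNumber" type="xs:string"/>
      <xs:element name="Expiration" type="xs:gYearMonth"/>
      <xs:element name="Amount" type="xs:decimal"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

schema = ET.fromstring(PAYMENT_SCHEMA)

# List each declared element with its built-in datatype.
fields = [
    (element.get("name"), element.get("type"))
    for element in schema.iter(f"{{{XSD_NS}}}element")
]
print(fields)
```

The script only reads the schema; checking a message against it requires a schema-aware validator, which the Python standard library does not include.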
A schema contains a set of datatype and element definitions. A Web service uses the schema not only to communicate the type of data that is expected to be within a message but also to validate incoming and outgoing messages.
A schema alone does not provide enough information to effectively describe a Web service, however. The schema does not describe the message patterns between the client and the server. For example, a client needs to know whether to expect a response when an order is posted to the ERP system. A client also needs to know over what transport protocol the Web service expects to receive requests. Finally, the client needs to know the address where the Web service can be reached.
This information is provided by a Web Services Description Language (WSDL) document. A WSDL document is an XML document that fully describes a particular Web service. Tools such as ASP.NET WSDL.exe and Remoting SOAPSUDS.exe can consume WSDL and automatically build proxies for the developer.
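The following sketch shows how a client-side tool might pull the message pattern, transport, and address out of a WSDL document. The WSDL shown is heavily trimmed (the message and type definitions are omitted, so it is not a complete document), and the payment service, its names, and its URL are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Trimmed, hypothetical WSDL for a payment service.
WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
             xmlns:tns="urn:example:payments"
             targetNamespace="urn:example:payments">
  <portType name="PaymentPort">
    <operation name="AuthorizePayment">
      <input message="tns:AuthorizePaymentRequest"/>
      <output message="tns:AuthorizePaymentResponse"/>
    </operation>
  </portType>
  <binding name="PaymentBinding" type="tns:PaymentPort">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  </binding>
  <service name="PaymentService">
    <port name="PaymentPort" binding="tns:PaymentBinding">
      <soap:address location="http://example.com/payments"/>
    </port>
  </service>
</definitions>
"""

doc = ET.fromstring(WSDL)

# Message pattern: an operation with an output expects a response.
operations = {}
for operation in doc.findall("wsdl:portType/wsdl:operation", NS):
    has_output = operation.find("wsdl:output", NS) is not None
    operations[operation.get("name")] = "request/response" if has_output else "one-way"

# Transport protocol and address of the service.
transport = doc.find("wsdl:binding/soap:binding", NS).get("transport")
address = doc.find("wsdl:service/wsdl:port/soap:address", NS).get("location")

print(operations, transport, address)
```

A proxy generator reads the same information, plus the omitted message and type definitions, to emit a class with one method per operation.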
As with any component used to build software, a Web service should also be accompanied by written documentation for developers who program against the Web service. The documentation should describe what the Web service does, the interfaces it exposes, and some examples of how to use it. Good documentation is especially important if the Web service is exposed to clients over the Internet.
Choosing Discovery Mechanisms
Once you’ve developed and documented a Web service, how can potential clients locate it? If the Web service is designed to be consumed by a member of your development team, your approach can be pretty informal, such as sharing the URL of the WSDL document with your peer a couple of cubicles down. But when potential clients are on the Internet, advertising your Web service effectively is an entirely different story.
What’s needed is a common way to advertise Web services. Universal Description, Discovery, and Integration (UDDI) provides just such a mechanism. UDDI is an industry-standard centralized directory service that can be used to advertise and locate Web services. UDDI allows users to search for Web services using a host of search criteria, including company name, category, and type of Web service.
Web services can also be advertised via DISCO, a proprietary XML document format defined by Microsoft that allows Web sites to advertise the services they expose. DISCO defines a simple protocol for facilitating a hyperlink style for locating resources. The primary consumer of DISCO is Microsoft Visual Studio .NET. A developer can target a particular Web server and navigate through the various Web services exposed by the server.
What’s Missing from Web Services?
You might have noticed that some key items found within a distributed component infrastructure are not defined by Web services. Two of the more noticeable omissions are a well-defined API for creating and consuming Web services and a set of component services, such as support for distributed transactions. Let’s discuss each of these missing pieces.
- Web service-specific API: Most distributed component infrastructures define an API to perform such tasks as initializing the runtime, creating an instance of a component, and reflecting the metadata used to describe the component. Because most high-level programming languages provide some degree of interoperability with C, the API is usually exposed as a flat set of C method signatures. RMI goes so far as to tightly couple its API with a single high-level language, Java.
In an effort to ensure that Web services are programming language-agnostic, Microsoft has left it up to individual software vendors to bind support for Web services to a particular platform. I will discuss two Web service implementations for the .NET platform, ASP.NET and Remoting, later in the book.
- Component services: The Web services platform does not provide many of the services commonly found in distributed component infrastructures, such as remote object lifetime management, object pooling, and support for distributed transactions. These services are left up to the distributed component infrastructure to implement.
Some services, such as support for distributed transactions, can be introduced later as the technology matures. Others, such as object pooling and possibly object lifetime management, can be considered an implementation detail of the platform. For example, Remoting defines extensions to provide support for object lifetime management, and Microsoft Component Services provides support for object pooling.
Summary
Component-based programming has proven to be a boon to developer productivity, but some services cannot be encapsulated by a component that resides within the client’s datacenter. Legacy technologies such as DCOM, CORBA, and Java RMI are ill-suited to allowing clients to access services over the Internet, so Microsoft found it necessary to start from the bottom and build an industry-standard way of accessing remote services.
Web services is an umbrella term that describes a collection of industry-standard protocols and services used to facilitate a baseline level of interoperability between applications. The industry support that Web services have received is unprecedented. Never before have so many leading technology companies stepped up to support a standard that facilitates interoperability between applications, regardless of the platform on which they are run.
One of the contributing factors to the success of Web services is that they’re built on existing Internet standards such as XML and HTTP. As a result, any system capable of parsing text and communicating via a standard Internet transport protocol can communicate with a Web service. Companies can also leverage the investment they have already made in these technologies.
