Content negotiation (RFC 2068)



Hypertext Transfer Protocol — HTTP/1.1

Status of this Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the «Internet Official Protocol Standards» (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright (C) The Internet Society (1999). All Rights Reserved.

Abstract

The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers [47]. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.

HTTP has been in use by the World-Wide Web global information initiative since 1990. This specification defines the protocol referred to as «HTTP/1.1», and is an update to RFC 2068 [33].

Spring boot controller content negotiation

I have a simple REST controller written in a Spring Boot application, but I am not sure how to implement content negotiation so that it returns JSON or XML based on the Content-Type parameter in the request header. Could someone explain what I am doing wrong?

I always get JSON when calling this method (even if I specify the Content-Type to be application/xml or text/xml ).

When I implement two methods each with different mapping and different content type, I am able to get XML from the xml one but it does not work if I specify two mediaTypes in a single method (like the provided example).

What I would like is to call the /message endpoint and receive

  • XML when the Content-Type of the GET request is set to application/xml
  • JSON when the Content-Type is application/json

Any help is appreciated.

EDIT: I updated my controller to accept all media types

2 Answers

You can find some hints in the blog post @RequestMapping with Produces and Consumes at point 6.

Pay attention to the section about Content-Type and Accept headers:

@RequestMapping with Produces and Consumes: We can use header Content-Type and Accept to find out request contents and what is the mime message it wants in response. For clarity, @RequestMapping provides produces and consumes variables where we can specify the request content-type for which method will be invoked and the response content type. For example:

Above method can consume message only with Content-Type as text/html and is able to produce messages of type application/json and application/xml.
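As a language-neutral illustration of that distinction (this is not Spring code), here is a minimal Python sketch of a server choosing the response format. The function name `pick_response_type` and the `produces` tuple are hypothetical, the header lookup assumes a plain dict with exact-case keys, and q-values are ignored for brevity. The point it shows is that the choice is driven by Accept, while Content-Type only describes the request body.

```python
def pick_response_type(request_headers, produces=("application/json", "application/xml")):
    """Return the first producible media type listed in the Accept header.

    Content-Type is deliberately ignored: it describes the request body,
    not the format the client wants back.
    """
    accept = request_headers.get("Accept", "*/*")
    # Drop parameters such as ";q=0.8" and normalise case and whitespace.
    wanted = [item.split(";")[0].strip().lower() for item in accept.split(",")]
    for media_type in wanted:
        if media_type == "*/*":
            return produces[0]  # client accepts anything: use the default
        if media_type in produces:
            return media_type
    return None  # nothing acceptable; a server would typically answer 406


xml_reply = pick_response_type({"Accept": "application/xml"})
default_reply = pick_response_type({"Content-Type": "application/xml"})
```

In this sketch a GET request that only sets Content-Type falls back to the default (JSON), which matches the behaviour described in the question.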

You can also try a different approach (using a ResponseEntity object) that lets you find out the incoming message type and produce the corresponding message (also exploiting the @ResponseBody annotation).

Status of this Memo

By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress”.

This Internet-Draft will expire in May 2008.

Copyright © The IETF Trust (2007). All Rights Reserved.

Abstract

The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. HTTP has been in use by the World Wide Web global information initiative since 1990. This document is Part 3 of the eight-part specification that defines the protocol referred to as «HTTP/1.1» and, taken together, updates RFC 2616 and RFC 2617. Part 3 defines HTTP message content, metadata, and content negotiation.

Table of Contents

  • 1. Introduction
  • 2. Protocol Parameters
    • 2.1 Character Sets
      • 2.1.1 Missing Charset
    • 2.2 Content Codings
    • 2.3 Media Types
      • 2.3.1 Canonicalization and Text Defaults
      • 2.3.2 Multipart Types
    • 2.4 Quality Values
    • 2.5 Language Tags
  • 3. Entity
    • 3.1 Entity Header Fields
    • 3.2 Entity Body
      • 3.2.1 Type
      • 3.2.2 Entity Length
  • 4. Content Negotiation
    • 4.1 Server-driven Negotiation
    • 4.2 Agent-driven Negotiation
    • 4.3 Transparent Negotiation
  • 5. Header Field Definitions
    • 5.1 Accept
    • 5.2 Accept-Charset
    • 5.3 Accept-Encoding
    • 5.4 Accept-Language
    • 5.5 Content-Encoding
    • 5.6 Content-Language
    • 5.7 Content-Location
    • 5.8 Content-MD5
    • 5.9 Content-Type
  • 6. IANA Considerations

    1. Introduction

    This document will define aspects of HTTP related to the payload of messages (message content), including metadata and media types, along with HTTP content negotiation. Right now it only includes the extracted relevant sections of RFC 2616 without edit.

    2. Protocol Parameters

    2.1 Character Sets

    HTTP uses the same definition of the term «character set» as that described for MIME:

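    The term «character set» is used in this document to refer to a method used with one or more tables to convert a sequence of octets into a sequence of characters. Note that unconditional conversion in the other direction is not required, in that not all characters may be available in a given character set and a character set may provide more than one sequence of octets to represent a particular character. This definition is intended to allow various kinds of character encoding, from simple single-table mappings such as US-ASCII to complex table switching methods such as those that use ISO-2022’s techniques. However, the definition associated with a MIME character set name MUST fully specify the mapping to be performed from octets to characters. In particular, use of external profiling information to determine the exact mapping is not permitted.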

    Note: This use of the term «character set» is more commonly referred to as a «character encoding.» However, since HTTP and MIME share the same registry, it is important that the terminology also be shared.

    Although HTTP allows an arbitrary token to be used as a charset value, any token that has a predefined value within the IANA Character Set registry [RFC1700] MUST represent the character set defined by that registry. Applications SHOULD limit their use of character sets to those defined by the IANA registry.

    HTTP uses charset in two contexts: within an Accept-Charset request header (in which the charset value is an unquoted token) and as the value of a parameter in a Content-type header (within a request or response), in which case the parameter value of the charset parameter may be quoted.

    Implementors should be aware of IETF character set requirements [RFC2279] [RFC2277] .

    2.1.1 Missing Charset

    Some HTTP/1.0 software has interpreted a Content-Type header without charset parameter incorrectly to mean «recipient should guess.» Senders wishing to defeat this behavior MAY include a charset parameter even when the charset is ISO-8859-1 and SHOULD do so when it is known that it will not confuse the recipient.

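    Unfortunately, some older HTTP/1.0 clients did not deal properly with an explicit charset parameter. HTTP/1.1 recipients MUST respect the charset label provided by the sender; and those user agents that have a provision to «guess» a charset MUST use the charset from the content-type field if they support that charset, rather than the recipient’s preference, when initially displaying a document. See Section 2.3.1.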

    2.2 Content Codings

    Content coding values indicate an encoding transformation that has been or can be applied to an entity. Content codings are primarily used to allow a document to be compressed or otherwise usefully transformed without losing the identity of its underlying media type and without loss of information. Frequently, the entity is stored in coded form, transmitted directly, and only decoded by the recipient.

    All content-coding values are case-insensitive. HTTP/1.1 uses content-coding values in the Accept-Encoding (Section 5.3) and Content-Encoding (Section 5.5) header fields. Although the value describes the content-coding, what is more important is that it indicates what decoding mechanism will be required to remove the encoding.

    The Internet Assigned Numbers Authority (IANA) acts as a registry for content-coding value tokens. Initially, the registry contains the following tokens:

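    gzip: An encoding format produced by the file compression program «gzip» (GNU zip) as described in RFC 1952 [RFC1952]. This format is a Lempel-Ziv coding (LZ77) with a 32 bit CRC.

    compress: The encoding format produced by the common UNIX file compression program «compress». This format is an adaptive Lempel-Ziv-Welch coding (LZW). Use of program names for the identification of encoding formats is not desirable and is discouraged for future encodings. Their use here is representative of historical practice, not good design. For compatibility with previous implementations of HTTP, applications SHOULD consider «x-gzip» and «x-compress» to be equivalent to «gzip» and «compress» respectively.

    deflate: The «zlib» format defined in RFC 1950 [RFC1950] in combination with the «deflate» compression mechanism described in RFC 1951 [RFC1951].

    identity: The default (identity) encoding; the use of no transformation whatsoever. This content-coding is used only in the Accept-Encoding header, and SHOULD NOT be used in the Content-Encoding header.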

    New content-coding value tokens SHOULD be registered; to allow interoperability between clients and servers, specifications of the content coding algorithms needed to implement a new value SHOULD be publicly available and adequate for independent implementation, and conform to the purpose of content coding defined in this section.

    2.3 Media Types

    HTTP uses Internet Media Types [RFC4288] in the Content-Type (Section 5.9) and Accept (Section 5.1) header fields in order to provide open and extensible data typing and type negotiation.

    Parameters MAY follow the type/subtype in the form of attribute/value pairs.

    The type, subtype, and parameter attribute names are case-insensitive. Parameter values might or might not be case-sensitive, depending on the semantics of the parameter name. Linear white space (LWS) MUST NOT be used between the type and subtype, nor between an attribute and its value. The presence or absence of a parameter might be significant to the processing of a media-type, depending on its definition within the media type registry.

    Note that some older HTTP applications do not recognize media type parameters. When sending data to older HTTP applications, implementations SHOULD only use media type parameters when they are required by that type/subtype definition.

    Media-type values are registered with the Internet Assigned Number Authority (IANA [RFC1700] ). The media type registration process is outlined in RFC 4288 [RFC4288] . Use of non-registered media types is discouraged.

    2.3.1 Canonicalization and Text Defaults

    Internet media types are registered with a canonical form. An entity-body transferred via HTTP messages MUST be represented in the appropriate canonical form prior to its transmission except for «text» types, as defined in the next paragraph.

    When in canonical form, media subtypes of the «text» type use CRLF as the text line break. HTTP relaxes this requirement and allows the transport of text media with plain CR or LF alone representing a line break when it is done consistently for an entire entity-body. HTTP applications MUST accept CRLF, bare CR, and bare LF as being representative of a line break in text media received via HTTP. In addition, if the text is represented in a character set that does not use octets 13 and 10 for CR and LF respectively, as is the case for some multi-byte character sets, HTTP allows the use of whatever octet sequences are defined by that character set to represent the equivalent of CR and LF for line breaks. This flexibility regarding line breaks applies only to text media in the entity-body; a bare CR or LF MUST NOT be substituted for CRLF within any of the HTTP control structures (such as header fields and multipart boundaries).

    If an entity-body is encoded with a content-coding, the underlying data MUST be in a form defined above prior to being encoded.

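    The «charset» parameter is used with some media types to define the character set (Section 2.1) of the data. When no explicit charset parameter is provided by the sender, media subtypes of the «text» type are defined to have a default charset value of «ISO-8859-1» when received via HTTP. Data in character sets other than «ISO-8859-1» or its subsets MUST be labeled with an appropriate charset value. See Section 2.1.1 for compatibility problems.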
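The charset rules in this section can be sketched in Python as follows. This is a minimal illustration, not a full media type parser: the function name `effective_charset` is hypothetical, and the naive split on «;» assumes no parameter value contains a quoted semicolon. It returns the explicit charset when one is given and otherwise falls back to ISO-8859-1 for «text» subtypes, which is the HTTP/1.1 default for text media received via HTTP.

```python
def effective_charset(content_type):
    """Return the charset to assume for a Content-Type value, or None."""
    parts = [p.strip() for p in content_type.split(";")]
    media_type = parts[0].lower()
    for param in parts[1:]:
        name, sep, value = param.partition("=")
        if sep and name.strip().lower() == "charset":
            # In Content-Type the charset parameter value may be quoted.
            return value.strip().strip('"').lower()
    if media_type.startswith("text/"):
        return "iso-8859-1"  # default for text types with no explicit charset
    return None  # no default is defined for non-text types


html_default = effective_charset("text/html")
html_utf8 = effective_charset('text/html; charset="UTF-8"')
```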

    2.3.2 Multipart Types

    MIME provides for a number of «multipart» types: encapsulations of one or more entities within a single message-body. All multipart types share a common syntax, as defined in Section 5.1.1 of RFC 2046 [RFC2046], and MUST include a boundary parameter as part of the media type value. The message body is itself a protocol element and MUST therefore use only CRLF to represent line breaks between body-parts. Unlike in RFC 2046, the epilogue of any multipart message MUST be empty; HTTP applications MUST NOT transmit the epilogue (even if the original multipart contains an epilogue). These restrictions exist in order to preserve the self-delimiting nature of a multipart message-body, wherein the «end» of the message-body is indicated by the ending multipart boundary.

    In general, HTTP treats a multipart message-body no differently than any other media type: strictly as payload. The one exception is the «multipart/byteranges» type ([Part 5]) when it appears in a 206 (Partial Content) response. In all other cases, an HTTP user agent SHOULD follow the same or similar behavior as a MIME user agent would upon receipt of a multipart type. The MIME header fields within each body-part of a multipart message-body do not have any significance to HTTP beyond that defined by their MIME semantics.

    In general, an HTTP user agent SHOULD follow the same or similar behavior as a MIME user agent would upon receipt of a multipart type. If an application receives an unrecognized multipart subtype, the application MUST treat it as being equivalent to «multipart/mixed».

    Note: The «multipart/form-data» type has been specifically defined for carrying form data suitable for processing via the POST request method, as described in RFC 1867 [RFC1867] .

    2.4 Quality Values

    HTTP content negotiation (Section 4) uses short «floating point» numbers to indicate the relative importance («weight») of various negotiable parameters. A weight is normalized to a real number in the range 0 through 1, where 0 is the minimum and 1 the maximum value. If a parameter has a quality value of 0, then content with this parameter is «not acceptable» for the client. HTTP/1.1 applications MUST NOT generate more than three digits after the decimal point. User configuration of these values SHOULD also be limited in this fashion.

    «Quality values» is a misnomer, since these values merely represent relative degradation in desired quality.
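A minimal Python sketch of checking a quality value against these limits is shown below. The regular expression follows the qvalue grammar of RFC 2616 (a leading 0 with up to three decimal digits, or a leading 1 with up to three zeros); the names `QVALUE` and `parse_qvalue` are hypothetical.

```python
import re

# "0", "0.", "0.5", "0.123" or "1", "1.0", "1.000"; never more than 3 decimals.
QVALUE = re.compile(r"(?:0(?:\.\d{0,3})?|1(?:\.0{0,3})?)")


def parse_qvalue(text):
    """Return the weight as a float, or raise ValueError if it is malformed."""
    if QVALUE.fullmatch(text) is None:
        raise ValueError("not a valid qvalue: " + repr(text))
    return float(text)


weight = parse_qvalue("0.8")
```

Rejecting malformed values outright is one possible policy; a lenient recipient might instead ignore the offending parameter.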

    2.5 Language Tags

    A language tag identifies a natural language spoken, written, or otherwise conveyed by human beings for communication of information to other human beings. Computer languages are explicitly excluded. HTTP uses language tags within the Accept-Language and Content-Language fields.

    The syntax and registry of HTTP language tags is the same as that defined by RFC 1766 [RFC1766]. In summary, a language tag is composed of 1 or more parts: a primary language tag and a possibly empty series of subtags:

        language-tag  = primary-tag *( "-" subtag )
        primary-tag   = 1*8ALPHA
        subtag        = 1*8ALPHA

    White space is not allowed within the tag and all tags are case-insensitive. The name space of language tags is administered by the IANA. Example tags include:

        en, en-US, en-cockney, i-cherokee, x-pig-latin

    where any two-letter primary-tag is an ISO-639 language abbreviation and any two-letter initial subtag is an ISO-3166 country code. (The last three tags above are not registered tags; all but the last are examples of tags which could be registered in future.)
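The tag syntax can be checked with a short Python sketch. It assumes the RFC 1766 shape used by RFC 2616, where the primary tag and every subtag consist of 1 to 8 letters; the names `LANGUAGE_TAG`, `is_language_tag` and `same_language_tag` are hypothetical, and no registry lookup is attempted.

```python
import re

# One primary tag followed by zero or more "-subtag" parts, 1 to 8 letters each.
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_language_tag(tag):
    """True if the tag is syntactically well formed (no white space allowed)."""
    return LANGUAGE_TAG.fullmatch(tag) is not None


def same_language_tag(a, b):
    """Tags are case-insensitive, so compare them after lowercasing."""
    return a.lower() == b.lower()


well_formed = is_language_tag("en-US")
```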

    3. Entity

    Request and Response messages MAY transfer an entity if not otherwise restricted by the request method or response status code. An entity consists of entity-header fields and an entity-body, although some responses will only include the entity-headers.

    In this section, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity.

    3.1 Entity Header Fields

    Entity-header fields define metainformation about the entity-body or, if no body is present, about the resource identified by the request. Some of this metainformation is OPTIONAL; some might be REQUIRED by portions of this specification.

    The extension-header mechanism allows additional entity-header fields to be defined without changing the protocol, but these fields cannot be assumed to be recognizable by the recipient. Unrecognized header fields SHOULD be ignored by the recipient and MUST be forwarded by transparent proxies.

    3.2 Entity Body

    The entity-body (if any) sent with an HTTP request or response is in a format and encoding defined by the entity-header fields.

    An entity-body is only present in a message when a message-body is present, as described in [Part 1]. The entity-body is obtained from the message-body by decoding any Transfer-Encoding that might have been applied to ensure safe and proper transfer of the message.

    3.2.1 Type

    When an entity-body is included with a message, the data type of that body is determined via the header fields Content-Type and Content-Encoding. These define a two-layer, ordered encoding model:

        entity-body := Content-Encoding( Content-Type( data ) )

    Content-Type specifies the media type of the underlying data. Content-Encoding may be used to indicate any additional content codings applied to the data, usually for the purpose of data compression, that are a property of the requested resource. There is no default encoding.

    Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type «application/octet-stream».
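The two-layer model can be illustrated from the recipient's side: first undo the content codings, then interpret the resulting bytes according to the media type. The Python sketch below is a simplified illustration for text data only. The function name `decode_entity` is hypothetical, the charset is passed in directly instead of being parsed from Content-Type, and only the gzip, deflate and identity codings are handled.

```python
import gzip
import zlib


def decode_entity(body, content_encoding="", charset="iso-8859-1"):
    """Undo the content codings, then decode the bytes as text."""
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in the order they were applied, so undo them in reverse.
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "deflate":
            body = zlib.decompress(body)  # zlib wrapper around deflate data
        elif coding != "identity":
            raise ValueError("unsupported content-coding: " + coding)
    return body.decode(charset)


payload = gzip.compress("caf\u00e9".encode("utf-8"))
text = decode_entity(payload, content_encoding="gzip", charset="utf-8")
```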

    3.2.2 Entity Length

    The entity-length of a message is the length of the message-body before any transfer-codings have been applied. [Part 1] defines how the transfer-length of a message-body is determined.

    4. Content Negotiation

    Most HTTP responses include an entity which contains information for interpretation by a human user. Naturally, it is desirable to supply the user with the «best available» entity corresponding to the request. Unfortunately for servers and caches, not all users have the same preferences for what is «best,» and not all user agents are equally capable of rendering all entity types. For that reason, HTTP has provisions for several mechanisms for «content negotiation» — the process of selecting the best representation for a given response when there are multiple representations available.

    Note: This is not called «format negotiation» because the alternate representations may be of the same media type, but use different capabilities of that type, be in different languages, etc.

    Any response containing an entity-body MAY be subject to negotiation, including error responses.

    There are two kinds of content negotiation which are possible in HTTP: server-driven and agent-driven negotiation. These two kinds of negotiation are orthogonal and thus may be used separately or in combination. One method of combination, referred to as transparent negotiation, occurs when a cache uses the agent-driven negotiation information provided by the origin server in order to provide server-driven negotiation for subsequent requests.

    4.1 Server-driven Negotiation

    If the selection of the best representation for a response is made by an algorithm located at the server, it is called server-driven negotiation. Selection is based on the available representations of the response (the dimensions over which it can vary; e.g. language, content-coding, etc.) and the contents of particular header fields in the request message or on other information pertaining to the request (such as the network address of the client).

    Server-driven negotiation is advantageous when the algorithm for selecting from among the available representations is difficult to describe to the user agent, or when the server desires to send its «best guess» to the client along with the first response (hoping to avoid the round-trip delay of a subsequent request if the «best guess» is good enough for the user). In order to improve the server’s guess, the user agent MAY include request header fields (Accept, Accept-Language, Accept-Encoding, etc.) which describe its preferences for such a response.

    Server-driven negotiation has disadvantages:

    1. It is impossible for the server to accurately determine what might be «best» for any given user, since that would require complete knowledge of both the capabilities of the user agent and the intended use for the response (e.g., does the user want to view it on screen or print it on paper?).
    2. Having the user agent describe its capabilities in every request can be both very inefficient (given that only a small percentage of responses have multiple representations) and a potential violation of the user’s privacy.
    3. It complicates the implementation of an origin server and the algorithms for generating responses to a request.
    4. It may limit a public cache’s ability to use the same response for multiple user’s requests.

    HTTP/1.1 includes the following request-header fields for enabling server-driven negotiation through description of user agent capabilities and user preferences: Accept (Section 5.1), Accept-Charset (Section 5.2), Accept-Encoding (Section 5.3), Accept-Language (Section 5.4), and User-Agent ([Part 2]). However, an origin server is not limited to these dimensions and MAY vary the response based on any aspect of the request, including information outside the request-header fields or within extension header fields not defined by this specification.

    The Vary header field [Part 6] can be used to express the parameters the server uses to select a representation that is subject to server-driven negotiation.

    4.2 Agent-driven Negotiation

    With agent-driven negotiation, selection of the best representation for a response is performed by the user agent after receiving an initial response from the origin server. Selection is based on a list of the available representations of the response included within the header fields or entity-body of the initial response, with each representation identified by its own URI. Selection from among the representations may be performed automatically (if the user agent is capable of doing so) or manually by the user selecting from a generated (possibly hypertext) menu.

    Agent-driven negotiation is advantageous when the response would vary over commonly-used dimensions (such as type, language, or encoding), when the origin server is unable to determine a user agent’s capabilities from examining the request, and generally when public caches are used to distribute server load and reduce network usage.

    Agent-driven negotiation suffers from the disadvantage of needing a second request to obtain the best alternate representation. This second request is only efficient when caching is used. In addition, this specification does not define any mechanism for supporting automatic selection, though it also does not prevent any such mechanism from being developed as an extension and used within HTTP/1.1.

    HTTP/1.1 defines the 300 (Multiple Choices) and 406 (Not Acceptable) status codes for enabling agent-driven negotiation when the server is unwilling or unable to provide a varying response using server-driven negotiation.

    4.3 Transparent Negotiation

    Transparent negotiation is a combination of both server-driven and agent-driven negotiation. When a cache is supplied with a form of the list of available representations of the response (as in agent-driven negotiation) and the dimensions of variance are completely understood by the cache, then the cache becomes capable of performing server-driven negotiation on behalf of the origin server for subsequent requests on that resource.

    Transparent negotiation has the advantage of distributing the negotiation work that would otherwise be required of the origin server and also removing the second request delay of agent-driven negotiation when the cache is able to correctly guess the right response.

    This specification does not define any mechanism for transparent negotiation, though it also does not prevent any such mechanism from being developed as an extension that could be used within HTTP/1.1.

    5. Header Field Definitions

    This section defines the syntax and semantics of all standard HTTP/1.1 header fields. For entity-header fields, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity.

    5.1 Accept

    The Accept request-header field can be used to specify certain media types which are acceptable for the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image.

    The asterisk «*» character is used to group media types into ranges, with «*/*» indicating all media types and «type/*» indicating all subtypes of that type. The media-range MAY include media type parameters that are applicable to that range.

    Each media-range MAY be followed by one or more accept-params, beginning with the «q» parameter for indicating a relative quality factor. The first «q» parameter (if any) separates the media-range parameter(s) from the accept-params. Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (Section 2.4). The default value is q=1.

    Note: Use of the «q» parameter name to separate media type parameters from Accept extension parameters is due to historical practice. Although this prevents any media type parameter named «q» from being used with a media range, such an event is believed to be unlikely given the lack of any «q» parameters in the IANA media type registry and the rare usage of any media type parameters in Accept. Future media types are discouraged from registering any parameter named «q».

    The example

        Accept: audio/*; q=0.2, audio/basic

    SHOULD be interpreted as «I prefer audio/basic, but send me any audio type if it is the best available after an 80% mark-down in quality.»

    If no Accept header field is present, then it is assumed that the client accepts all media types. If an Accept header field is present, and if the server cannot send a response which is acceptable according to the combined Accept field value, then the server SHOULD send a 406 (not acceptable) response.

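    A more elaborate example is

        Accept: text/plain; q=0.5, text/html,
                text/x-dvi; q=0.8, text/x-c

    Verbally, this would be interpreted as «text/html and text/x-c are the preferred media types, but if they do not exist, then send the text/x-dvi entity, and if that does not exist, send the text/plain entity.»

    Media ranges can be overridden by more specific media ranges or specific media types. If more than one media range applies to a given type, the most specific reference has precedence. For example,

        Accept: text/*, text/html, text/html;level=1, */*

    have the following precedence:

        1) text/html;level=1
        2) text/html
        3) text/*
        4) */*

    The media type quality factor associated with a given type is determined by finding the media range with the highest precedence which matches that type. For example,

        Accept: text/*;q=0.3, text/html;q=0.7, text/html;level=1,
                text/html;level=2;q=0.4, */*;q=0.5

    would cause the following values to be associated:

        text/html;level=1         = 1
        text/html                 = 0.7
        text/plain                = 0.3
        image/jpeg                = 0.5
        text/html;level=2         = 0.4
        text/html;level=3         = 0.7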

    Note: A user agent might be provided with a default set of quality values for certain media ranges. However, unless the user agent is a closed system which cannot interact with other rendering agents, this default set ought to be configurable by the user.
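The precedence rule for media ranges can be sketched in Python as follows. This is an illustrative implementation under simplifying assumptions: the names `parse_accept` and `quality` are hypothetical, quoted parameter values containing «;» or «,» are not handled, accept-extension parameters after «q» are ignored, and a type matched by no range gets a quality of 0. The header used at the end is the elaborate Accept example from RFC 2616 section 14.1.

```python
def parse_accept(header):
    """Parse an Accept value into (type, subtype, params, q) tuples."""
    ranges = []
    for item in header.split(","):
        pieces = [p.strip() for p in item.split(";")]
        if not pieces[0]:
            continue
        mtype, _, subtype = pieces[0].lower().partition("/")
        params, q = {}, 1.0  # the default quality is q=1
        for piece in pieces[1:]:
            name, _, value = piece.partition("=")
            name = name.strip().lower()
            if name == "q":
                q = float(value.strip())
                break  # the first q separates media parameters from accept-params
            params[name] = value.strip()
        ranges.append((mtype, subtype, params, q))
    return ranges


def quality(media_type, header):
    """Return the quality factor of the most specific matching media range."""
    pieces = [p.strip() for p in media_type.split(";")]
    mtype, _, subtype = pieces[0].lower().partition("/")
    params = {}
    for piece in pieces[1:]:
        name, _, value = piece.partition("=")
        params[name.strip().lower()] = value.strip()

    best_rank, best_q = -1, 0.0
    for r_type, r_sub, r_params, q in parse_accept(header):
        if r_type == "*" and r_sub == "*":
            rank = 0  # */* is the least specific range
        elif r_type == mtype and r_sub == "*":
            rank = 1  # type/* beats */*
        elif r_type == mtype and r_sub == subtype and all(
            params.get(k) == v for k, v in r_params.items()
        ):
            rank = 2 + len(r_params)  # exact type, more parameters = more specific
        else:
            continue
        if rank > best_rank:
            best_rank, best_q = rank, q
    return best_q


header = ("text/*;q=0.3, text/html;q=0.7, text/html;level=1, "
          "text/html;level=2;q=0.4, */*;q=0.5")
html_level_3 = quality("text/html;level=3", header)  # falls back to text/html: 0.7
```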

    5.2 Accept-Charset

    The Accept-Charset request-header field can be used to indicate what character sets are acceptable for the response. This field allows clients capable of understanding more comprehensive or special-purpose character sets to signal that capability to a server which is capable of representing documents in those character sets.

    Character set values are described in Section 2.1. Each charset MAY be given an associated quality value which represents the user’s preference for that charset. The default value is q=1. An example is

        Accept-Charset: iso-8859-5, unicode-1-1;q=0.8

    The special value «*», if present in the Accept-Charset field, matches every character set (including ISO-8859-1) which is not mentioned elsewhere in the Accept-Charset field. If no «*» is present in an Accept-Charset field, then all character sets not explicitly mentioned get a quality value of 0, except for ISO-8859-1, which gets a quality value of 1 if not explicitly mentioned.

    If no Accept-Charset header is present, the default is that any character set is acceptable. If an Accept-Charset header is present, and if the server cannot send a response which is acceptable according to the Accept-Charset header, then the server SHOULD send an error response with the 406 (not acceptable) status code, though the sending of an unacceptable response is also allowed.
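The matching rules for a present Accept-Charset field, including the special treatment of «*» and ISO-8859-1, can be sketched in Python as below. The function name `charset_quality` is hypothetical and the parsing is deliberately minimal (only a «q» parameter is recognised). The case where the header is absent altogether, in which any character set is acceptable, is left to the caller.

```python
def charset_quality(charset, header):
    """Return the quality value an Accept-Charset value assigns to a charset."""
    charset = charset.lower()
    explicit, wildcard = {}, None
    for item in header.split(","):
        token, _, param = item.partition(";")
        token = token.strip().lower()
        if not token:
            continue
        name, _, value = param.partition("=")
        q = float(value.strip()) if name.strip().lower() == "q" else 1.0
        if token == "*":
            wildcard = q
        else:
            explicit[token] = q
    if charset in explicit:
        return explicit[charset]
    if wildcard is not None:
        return wildcard  # "*" covers every charset not mentioned elsewhere
    # No "*": unmentioned charsets get 0, except ISO-8859-1, which gets 1.
    return 1.0 if charset == "iso-8859-1" else 0.0


header = "iso-8859-5, unicode-1-1;q=0.8"
latin1 = charset_quality("ISO-8859-1", header)
```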

    5.3 Accept-Encoding

    The Accept-Encoding request-header field is similar to Accept, but restricts the content-codings (Section 2.2) that are acceptable in the response.

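    Examples of its use are:

        Accept-Encoding: compress, gzip
        Accept-Encoding:
        Accept-Encoding: *
        Accept-Encoding: compress;q=0.5, gzip;q=1.0
        Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0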

    A server tests whether a content-coding is acceptable, according to an Accept-Encoding field, using these rules:

    1. If the content-coding is one of the content-codings listed in the Accept-Encoding field, then it is acceptable, unless it is accompanied by a qvalue of 0. (As defined in Section 2.4, a qvalue of 0 means «not acceptable.»)
    2. The special «*» symbol in an Accept-Encoding field matches any available content-coding not explicitly listed in the header field.
    3. If multiple content-codings are acceptable, then the acceptable content-coding with the highest non-zero qvalue is preferred.
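    4. The «identity» content-coding is always acceptable, unless specifically refused because the Accept-Encoding field includes «identity;q=0», or because the field includes «*;q=0» and does not explicitly include the «identity» content-coding. If the Accept-Encoding field-value is empty, then only the «identity» encoding is acceptable.

    If an Accept-Encoding field is present in a request, and if the server cannot send a response which is acceptable according to the Accept-Encoding header, then the server SHOULD send an error response with the 406 (Not Acceptable) status code.

    If no Accept-Encoding field is present in a request, the server MAY assume that the client will accept any content coding. In this case, if «identity» is one of the available content-codings, then the server SHOULD use the «identity» content-coding, unless it has additional information that a different content-coding is meaningful to the client.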

    Note: If the request does not include an Accept-Encoding field, and if the «identity» content-coding is unavailable, then content-codings commonly understood by HTTP/1.0 clients (i.e., «gzip» and «compress») are preferred; some older clients improperly display messages sent with other content-codings. The server might also make this decision based on information about the particular user-agent or client.

    Note: Most HTTP/1.0 applications do not recognize or obey qvalues associated with content-codings. This means that qvalues will not work and are not permitted with x-gzip or x-compress.
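The acceptability rules for content-codings can be sketched in Python as follows. The function names are hypothetical, and one design choice is an assumption of this sketch and not something the text prescribes: an unlisted «identity» coding is used only as a fallback when no explicitly acceptable coding is available. Passing `None` for the header models a request without an Accept-Encoding field, and a return value of `None` models the 406 case.

```python
def parse_accept_encoding(header):
    """Return ({coding: q}, wildcard_q or None) for an Accept-Encoding value."""
    explicit, wildcard = {}, None
    for item in header.split(","):
        token, _, param = item.partition(";")
        token = token.strip().lower()
        if not token:
            continue
        name, _, value = param.partition("=")
        q = float(value.strip()) if name.strip().lower() == "q" else 1.0
        if token == "*":
            wildcard = q
        else:
            explicit[token] = q
    return explicit, wildcard


def choose_coding(available, header):
    """Pick a content-coding from `available`, or None if none is acceptable."""
    available = [c.lower() for c in available]
    if header is None:
        # No field at all: prefer identity when the server can send it.
        if "identity" in available:
            return "identity"
        return available[0] if available else None
    explicit, wildcard = parse_accept_encoding(header)
    scored = []
    for coding in available:
        if coding in explicit:
            q = explicit[coding]   # rule 1: listed codings use their own qvalue
        elif wildcard is not None:
            q = wildcard           # rule 2: "*" matches any unlisted coding
        else:
            continue
        if q > 0:
            scored.append((q, coding))
    if scored:
        return max(scored, key=lambda pair: pair[0])[1]  # rule 3: highest qvalue
    # Rule 4: identity stays acceptable unless it was refused with q=0.
    if "identity" in available and explicit.get("identity", wildcard) is None:
        return "identity"
    return None


picked = choose_coding(["gzip", "identity"], "gzip;q=1.0, identity; q=0.5, *;q=0")
```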

    5.4 Accept-Language

    The Accept-Language request-header field is similar to Accept, but restricts the set of natural languages that are preferred as a response to the request. Language tags are defined in Section 2.5.

    Each language-range MAY be given an associated quality value which represents an estimate of the user’s preference for the languages specified by that range. The quality value defaults to «q=1». For example,

        Accept-Language: da, en-gb;q=0.8, en;q=0.7

    would mean: «I prefer Danish, but will accept British English and other types of English.» A language-range matches a language-tag if it exactly equals the tag, or if it exactly equals a prefix of the tag such that the first tag character following the prefix is «-». The special range «*», if present in the Accept-Language field, matches every tag not matched by any other range present in the Accept-Language field.

    Note: This use of a prefix matching rule does not imply that language tags are assigned to languages in such a way that it is always true that if a user understands a language with a certain tag, then this user will also understand all languages with tags for which this tag is a prefix. The prefix rule simply allows the use of prefix tags if this is the case.

    The language quality factor assigned to a language-tag by the Accept-Language field is the quality value of the longest language-range in the field that matches the language-tag. If no language-range in the field matches the tag, the language quality factor assigned is 0. If no Accept-Language header is present in the request, the server SHOULD assume that all languages are equally acceptable. If an Accept-Language header is present, then all languages which are assigned a quality factor greater than 0 are acceptable.
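The matching rule and the "longest matching range wins" rule can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `LanguageRangeMatcher` and its methods are hypothetical, the parser handles only the simple `range;q=value` form, and the header value `da, en-gb;q=0.8, en;q=0.7` is used as sample input.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of Accept-Language matching; not a full header parser. */
final class LanguageRangeMatcher {

    /** Parses a value such as "da, en-gb;q=0.8, en;q=0.7" into range -> qvalue. */
    static Map<String, Double> parse(String headerValue) {
        Map<String, Double> ranges = new LinkedHashMap<>();
        for (String element : headerValue.split(",")) {
            String[] parts = element.trim().split(";");
            String range = parts[0].trim().toLowerCase(Locale.ROOT);
            if (range.isEmpty()) {
                continue;
            }
            double q = 1.0; // the quality value defaults to q=1
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2).trim());
                }
            }
            ranges.put(range, q);
        }
        return ranges;
    }

    /** A range matches a tag if it equals the tag or is a prefix followed by "-". */
    static boolean matches(String range, String tag) {
        String r = range.toLowerCase(Locale.ROOT);
        String t = tag.toLowerCase(Locale.ROOT);
        return t.equals(r) || t.startsWith(r + "-");
    }

    /** Quality of a tag: qvalue of the longest matching range, then "*", else 0. */
    static double quality(Map<String, Double> ranges, String tag) {
        String best = null;
        for (String range : ranges.keySet()) {
            if (!range.equals("*") && matches(range, tag)
                    && (best == null || range.length() > best.length())) {
                best = range;
            }
        }
        if (best != null) {
            return ranges.get(best);
        }
        // "*" only applies to tags not matched by any other range
        return ranges.getOrDefault("*", 0.0);
    }

    public static void main(String[] args) {
        Map<String, Double> ranges = parse("da, en-gb;q=0.8, en;q=0.7");
        for (String tag : new String[] {"da", "en-GB", "en-US", "fr"}) {
            System.out.println(tag + " -> " + quality(ranges, tag));
        }
    }
}
```

With this input, `en-GB` is matched by both `en-gb` and `en`, and the longer range decides its quality, while `fr` matches nothing and is therefore not acceptable.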

    It might be contrary to the privacy expectations of the user to send an Accept-Language header with the complete linguistic preferences of the user in every request. For a discussion of this issue, see Section 7.1.

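    As intelligibility is highly dependent on the individual user, it is recommended that client applications make the choice of linguistic preference available to the user. If the choice is not made available, then the Accept-Language header field MUST NOT be given in the request.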

    Note: When making the choice of linguistic preference available to the user, we remind implementors of the fact that users are not familiar with the details of language matching as described above, and should provide appropriate guidance. As an example, users might assume that on selecting «en-gb», they will be served any kind of English document if British English is not available. A user agent might suggest in such a case to add «en» to get the best matching behavior.

    5.5 Content-Encoding

    The Content-Encoding entity-header field is used as a modifier to the media-type. When present, its value indicates what additional content codings have been applied to the entity-body, and thus what decoding mechanisms must be applied in order to obtain the media-type referenced by the Content-Type header field. Content-Encoding is primarily used to allow a document to be compressed without losing the identity of its underlying media type.

    Content codings are defined in Section 2.2. An example of its use is

        Content-Encoding: gzip

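    The content-coding is a characteristic of the entity identified by the Request-URI. Typically, the entity-body is stored with this encoding and is only decoded before rendering or analogous usage. However, a non-transparent proxy MAY modify the content-coding if the new coding is known to be acceptable to the recipient, unless the «no-transform» cache-control directive is present in the message.

    If the content-coding of an entity is not «identity», then the response MUST include a Content-Encoding entity-header (Section 5.5) that lists the non-identity content-coding(s) used.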

    If the content-coding of an entity in a request message is not acceptable to the origin server, the server SHOULD respond with a status code of 415 (Unsupported Media Type).

    If multiple encodings have been applied to an entity, the content codings MUST be listed in the order in which they were applied. Additional information about the encoding parameters MAY be provided by other entity-header fields not defined by this specification.
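Because codings are listed in the order they were applied, a recipient has to undo them in the opposite order. The following Java sketch shows only that ordering logic; the class and method names are hypothetical, and the actual decoders (gzip, deflate and so on) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: derive the decoding order from a Content-Encoding value. */
final class ContentEncodingOrder {

    /** Splits the header value into codings, in the order they were applied. */
    static List<String> appliedOrder(String headerValue) {
        List<String> codings = new ArrayList<>();
        for (String token : headerValue.split(",")) {
            String coding = token.trim().toLowerCase(Locale.ROOT);
            if (!coding.isEmpty()) {
                codings.add(coding);
            }
        }
        return codings;
    }

    /** The last coding applied is the first one to remove, so the list is reversed. */
    static List<String> decodeOrder(String headerValue) {
        List<String> codings = appliedOrder(headerValue);
        Collections.reverse(codings);
        return codings;
    }

    public static void main(String[] args) {
        System.out.println(decodeOrder("deflate, gzip"));
    }
}
```

A real recipient would map each name in the returned list to a decoder and respond with 415 (Unsupported Media Type) when a coding in a request is not one it can handle.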

    5.6 Content-Language

    The Content-Language entity-header field describes the natural language(s) of the intended audience for the enclosed entity. Note that this might not be equivalent to all the languages used within the entity-body.

    Language tags are defined in Section 2.5. The primary purpose of Content-Language is to allow a user to identify and differentiate entities according to the user’s own preferred language. Thus, if the body content is intended only for a Danish-literate audience, the appropriate field is

        Content-Language: da

    If no Content-Language is specified, the default is that the content is intended for all language audiences. This might mean that the sender does not consider it to be specific to any natural language, or that the sender does not know for which language it is intended.

    Multiple languages MAY be listed for content that is intended for multiple audiences. For example, a rendition of the «Treaty of Waitangi,» presented simultaneously in the original Maori and English versions, would call for

        Content-Language: mi, en

    However, just because multiple languages are present within an entity does not mean that it is intended for multiple linguistic audiences. An example would be a beginner’s language primer, such as «A First Lesson in Latin,» which is clearly intended to be used by an English-literate audience. In this case, the Content-Language would properly only include «en».

    Content-Language MAY be applied to any media type — it is not limited to textual documents.

    5.7 Content-Location

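    The Content-Location entity-header field MAY be used to supply the resource location for the entity enclosed in the message when that entity is accessible from a location separate from the requested resource’s URI. A server SHOULD provide a Content-Location for the variant corresponding to the response entity; especially in the case where a resource has multiple entities associated with it, and those entities actually have separate locations by which they might be individually accessed, the server SHOULD provide a Content-Location for the particular variant which is returned.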

    The value of Content-Location also defines the base URI for the entity.

    The Content-Location value is not a replacement for the original requested URI; it is only a statement of the location of the resource corresponding to this particular entity at the time of the request. Future requests MAY specify the Content-Location URI as the request-URI if the desire is to identify the source of that particular entity.

    A cache cannot assume that an entity with a Content-Location different from the URI used to retrieve it can be used to respond to later requests on that Content-Location URI. However, the Content-Location can be used to differentiate between multiple entities retrieved from a single requested resource, as described in [Part 6].

    If the Content-Location is a relative URI, the relative URI is interpreted relative to the Request-URI.
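Resolving a relative Content-Location against the Request-URI is ordinary relative-reference resolution. A minimal Java sketch using `java.net.URI` follows; the class name and the example URIs are illustrative assumptions, not taken from the specification.

```java
import java.net.URI;

/** Hypothetical sketch: resolve a Content-Location value against the Request-URI. */
final class ContentLocationResolver {

    /** Absolute values are returned unchanged; relative ones are resolved. */
    static URI resolve(String requestUri, String contentLocation) {
        return URI.create(requestUri).resolve(contentLocation);
    }

    public static void main(String[] args) {
        // Illustrative URIs only
        System.out.println(resolve("http://example.com/docs/report", "report.en.html"));
        System.out.println(resolve("http://example.com/docs/report", "http://mirror.example.org/report.en.html"));
    }
}
```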

    The meaning of the Content-Location header in PUT or POST requests is undefined; servers are free to ignore it in those cases.

    5.8 Content-MD5


    The Content-MD5 entity-header field, as defined in RFC 1864 [RFC1864], is an MD5 digest of the entity-body for the purpose of providing an end-to-end message integrity check (MIC) of the entity-body. (Note: a MIC is good for detecting accidental modification of the entity-body in transit, but is not proof against malicious attacks.)

    The Content-MD5 header field MAY be generated by an origin server or client to function as an integrity check of the entity-body. Only origin servers or clients MAY generate the Content-MD5 header field; proxies and gateways MUST NOT generate it, as this would defeat its value as an end-to-end integrity check. Any recipient of the entity-body, including gateways and proxies, MAY check that the digest value in this header field matches that of the entity-body as received.

    The MD5 digest is computed based on the content of the entity-body, including any content-coding that has been applied, but not including any transfer-encoding applied to the message-body. If the message is received with a transfer-encoding, that encoding MUST be removed prior to checking the Content-MD5 value against the received entity.

    This has the result that the digest is computed on the octets of the entity-body exactly as, and in the order that, they would be sent if no transfer-encoding were being applied.

    HTTP extends RFC 1864 to permit the digest to be computed for MIME composite media-types (e.g., multipart/* and message/rfc822), but this does not change how the digest is computed as defined in the preceding paragraph.

    There are several consequences of this. The entity-body for composite types MAY contain many body-parts, each with its own MIME and HTTP headers (including Content-MD5, Content-Transfer-Encoding, and Content-Encoding headers). If a body-part has a Content-Transfer-Encoding or Content-Encoding header, it is assumed that the content of the body-part has had the encoding applied, and the body-part is included in the Content-MD5 digest as is — i.e., after the application. The Transfer-Encoding header field is not allowed within body-parts.

    Conversion of all line breaks to CRLF MUST NOT be done before computing or checking the digest: the line break convention used in the text actually transmitted MUST be left unaltered when computing the digest.
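As a rough illustration, the digest can be computed over the entity-body octets exactly as received (content-coding still applied, transfer-encoding already removed, line breaks untouched) and then base64-encoded as in RFC 1864. The class and method names below are hypothetical; only standard JDK classes are used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Hypothetical sketch: compute and check a Content-MD5 value. */
final class ContentMd5 {

    /** Base64 of the MD5 digest of the entity-body octets, with no normalization. */
    static String digest(byte[] entityBody) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return Base64.getEncoder().encodeToString(md5.digest(entityBody));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Compares a received header value with the digest of the received body. */
    static boolean matches(String headerValue, byte[] receivedBody) {
        return digest(receivedBody).equals(headerValue.trim());
    }

    public static void main(String[] args) {
        byte[] lf = "line one\nline two\n".getBytes(StandardCharsets.UTF_8);
        byte[] crlf = "line one\r\nline two\r\n".getBytes(StandardCharsets.UTF_8);
        // The two digests differ, which is why line breaks must not be converted
        System.out.println(digest(lf));
        System.out.println(digest(crlf));
    }
}
```

The two bodies in `main` differ only in their line break convention, yet they produce different digests, so any conversion before checking would make a valid message fail verification.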

    Note: while the definition of Content-MD5 is exactly the same for HTTP as in RFC 1864 for MIME entity-bodies, there are several ways in which the application of Content-MD5 to HTTP entity-bodies differs from its application to MIME entity-bodies. One is that HTTP, unlike MIME, does not use Content-Transfer-Encoding, and does use Transfer-Encoding and Content-Encoding. Another is that HTTP more frequently uses binary content types than MIME, so it is worth noting that, in such cases, the byte order used to compute the digest is the transmission byte order defined for the type. Lastly, HTTP allows transmission of text types with any of several line break conventions and not just the canonical form using CRLF.

    5.9 Content-Type

    The Content-Type entity-header field indicates the media type of the entity-body sent to the recipient or, in the case of the HEAD method, the media type that would have been sent had the request been a GET.

    Media types are defined in Section 2.3. An example of the field is

        Content-Type: text/html; charset=ISO-8859-4

    Further discussion of methods for identifying the media type of an entity is provided in Section 3.2.1.

    6. IANA Considerations

    TBD.

    7. Security Considerations

    This section is meant to inform application developers, information providers, and users of the security limitations in HTTP/1.1 as described by this document. The discussion does not include definitive solutions to the problems revealed, though it does make some suggestions for reducing security risks.

    7.1 Privacy Issues Connected to Accept Headers

    Accept request-headers can reveal information about the user to all servers which are accessed. The Accept-Language header in particular can reveal information the user would consider to be of a private nature, because the understanding of particular languages is often strongly correlated to the membership of a particular ethnic group. User agents which offer the option to configure the contents of an Accept-Language header to be sent in every request are strongly encouraged to let the configuration process include a message which makes the user aware of the loss of privacy involved.

    An approach that limits the loss of privacy would be for a user agent to omit the sending of Accept-Language headers by default, and to ask the user whether or not to start sending Accept-Language headers to a server if it detects, by looking for any Vary response-header fields generated by the server, that such sending could improve the quality of service.

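    Elaborate user-customized accept header fields sent in every request, in particular if these include quality values, can be used by servers as relatively reliable and long-lived user identifiers. Such user identifiers would allow content providers to do click-trail tracking, and would allow collaborating content providers to match cross-server click-trails or form submissions of individual users. Note that for many users not behind a proxy, the network address of the host running the user agent will also serve as a long-lived user identifier. In environments where proxies are used to enhance privacy, user agents ought to be conservative in offering accept header configuration options to end users. As an extra privacy measure, proxies could filter the accept headers in relayed requests. General purpose user agents which provide a high degree of header configurability SHOULD warn users about the loss of privacy which can be involved.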

    7.2 Content-Disposition Issues

    8. Acknowledgments

    Based on an XML translation of RFC 2616 by Julian Reschke.

    9. References

    [RFC1700] Reynolds, J. and J. Postel, “Assigned Numbers”, STD 2, RFC 1700, October 1994.
    [RFC1766] Alvestrand, H., “Tags for the Identification of Languages”, RFC 1766, March 1995.
    [RFC1806] Troost, R. and S. Dorner, “Communicating Presentation Information in Internet Messages: The Content-Disposition Header”, RFC 1806, June 1995.
    [RFC1864] Myers, J. and M. Rose, “The Content-MD5 Header Field”, RFC 1864, October 1995.
    [RFC1867] Masinter, L. and E. Nebel, “Form-based File Upload in HTML”, RFC 1867, November 1995.
    [RFC1950] Deutsch, L.P. and J-L. Gailly, “ZLIB Compressed Data Format Specification version 3.3”, RFC 1950, May 1996.
    [RFC1951] Deutsch, P., “DEFLATE Compressed Data Format Specification version 1.3”, RFC 1951, May 1996.
    [RFC1952] Deutsch, P., Gailly, J-L., Adler, M., Deutsch, L.P., and G. Randers-Pehrson, “GZIP file format specification version 4.3”, RFC 1952, May 1996.
    [RFC2045] Freed, N. and N.S. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies”, RFC 2045, November 1996.
    [RFC2046] Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types”, RFC 2046, November 1996.
    [RFC2049] Freed, N. and N.S. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples”, RFC 2049, November 1996.
    [RFC2068] Fielding, R., Gettys, J., Mogul, J., Nielsen, H., and T. Berners-Lee, “Hypertext Transfer Protocol — HTTP/1.1”, RFC 2068, January 1997.
    [RFC2076] Palme, J., “Common Internet Message Headers”, RFC 2076, February 1997.
    [RFC2110] Palme, J. and A. Hopmann, “MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)”, RFC 2110, March 1997.
    [RFC2183] Troost, R., Dorner, S., and K. Moore, “Communicating Presentation Information in Internet Messages: The Content-Disposition Header Field”, RFC 2183, August 1997.
    [RFC2277] Alvestrand, H.T., “IETF Policy on Character Sets and Languages”, BCP 18, RFC 2277, January 1998.
    [RFC2279] Yergeau, F., “UTF-8, a transformation format of ISO 10646”, RFC 2279, January 1998.
    [RFC4288] Freed, N. and J. Klensin, “Media Type Specifications and Registration Procedures”, BCP 13, RFC 4288, December 2005.
    [RFC822] Crocker, D.H., “Standard for the format of ARPA Internet text messages”, STD 11, RFC 822, August 1982.

    A. Differences Between HTTP Entities and RFC 2045 Entities

    HTTP/1.1 uses many of the constructs defined for Internet Mail (RFC 822 [RFC822] ) and the Multipurpose Internet Mail Extensions (MIME [RFC2045] ) to allow entities to be transmitted in an open variety of representations and with extensible mechanisms. However, RFC 2045 discusses mail, and HTTP has a few features that are different from those described in RFC 2045. These differences were carefully chosen to optimize performance over binary connections, to allow greater freedom in the use of new media types, to make date comparisons easier, and to acknowledge the practice of some early HTTP servers and clients.

    This appendix describes specific areas where HTTP differs from RFC 2045. Proxies and gateways to strict MIME environments SHOULD be aware of these differences and provide the appropriate conversions where necessary. Proxies and gateways from MIME environments to HTTP also need to be aware of the differences because some conversions might be required.

    A.1 MIME-Version

    HTTP is not a MIME-compliant protocol. However, HTTP/1.1 messages MAY include a single MIME-Version general-header field to indicate what version of the MIME protocol was used to construct the message. Use of the MIME-Version header field indicates that the message is in full compliance with the MIME protocol (as defined in RFC 2045 [RFC2045] ). Proxies/gateways are responsible for ensuring full compliance (where possible) when exporting HTTP messages to strict MIME environments.

    MIME version «1.0» is the default for use in HTTP/1.1. However, HTTP/1.1 message parsing and semantics are defined by this document and not the MIME specification.

    A.2 Conversion to Canonical Form

    RFC 2045 [RFC2045] requires that an Internet mail entity be converted to canonical form prior to being transferred, as described in section 4 of RFC 2049 [RFC2049] . Section 2.3.1 of this document describes the forms allowed for subtypes of the «text» media type when transmitted over HTTP. RFC 2046 requires that content with a type of «text» represent line breaks as CRLF and forbids the use of CR or LF outside of line break sequences. HTTP allows CRLF, bare CR, and bare LF to indicate a line break within text content when a message is transmitted over HTTP.

    Where it is possible, a proxy or gateway from HTTP to a strict MIME environment SHOULD translate all line breaks within the text media types described in Section 2.3.1 of this document to the RFC 2049 canonical form of CRLF. Note, however, that this might be complicated by the presence of a Content-Encoding and by the fact that HTTP allows the use of some character sets which do not use octets 13 and 10 to represent CR and LF, as is the case for some multi-byte character sets.
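For text that has already been decoded into characters, the translation to canonical CRLF form can be sketched as below. This is an assumption-laden illustration: the class name is hypothetical, and it sidesteps the complications named above by operating on a decoded Java `String` rather than on raw, possibly content-coded, octets.

```java
/** Hypothetical sketch: rewrite CRLF, bare CR and bare LF as canonical CRLF. */
final class LineBreakCanonicalizer {

    /** CRLF is listed first so an existing CRLF is not expanded into two breaks. */
    static String toCrlf(String text) {
        return text.replaceAll("\r\n|\r|\n", "\r\n");
    }

    public static void main(String[] args) {
        String mixed = "first\nsecond\rthird\r\nfourth";
        // Print with escapes so the line breaks are visible
        System.out.println(toCrlf(mixed).replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Any checksum computed over the original octets, such as Content-MD5, will no longer match after this conversion unless the content was already in canonical form.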

    Implementors should note that conversion will break any cryptographic checksums applied to the original content unless the original content is already in canonical form. Therefore, the canonical form is recommended for any content that uses such checksums in HTTP.

    A.3 Introduction of Content-Encoding

    RFC 2045 does not include any concept equivalent to HTTP/1.1’s Content-Encoding header field. Since this acts as a modifier on the media type, proxies and gateways from HTTP to MIME-compliant protocols MUST either change the value of the Content-Type header field or decode the entity-body before forwarding the message. (Some experimental applications of Content-Type for Internet mail have used a media-type parameter of «;conversions= » to perform a function equivalent to Content-Encoding. However, this parameter is not part of RFC 2045).

    A.4 No Content-Transfer-Encoding

    HTTP does not use the Content-Transfer-Encoding field of RFC 2045. Proxies and gateways from MIME-compliant protocols to HTTP MUST remove any Content-Transfer-Encoding prior to delivering the response message to an HTTP client.

    Proxies and gateways from HTTP to MIME-compliant protocols are responsible for ensuring that the message is in the correct format and encoding for safe transport on that protocol, where «safe transport» is defined by the limitations of the protocol being used. Such a proxy or gateway SHOULD label the data with an appropriate Content-Transfer-Encoding if doing so will improve the likelihood of safe transport over the destination protocol.

    A.5 Introduction of Transfer-Encoding

    HTTP/1.1 introduces the Transfer-Encoding header field ([Part 1]). Proxies/gateways MUST remove any transfer-coding prior to forwarding a message via a MIME-compliant protocol.

    A.6 MHTML and Line Length Limitations

    HTTP implementations which share code with MHTML [RFC2110] implementations need to be aware of MIME line length limitations. Since HTTP does not have this limitation, HTTP does not fold long lines. MHTML messages being transported by HTTP follow all conventions of MHTML, including line length limitations and folding, canonicalization, etc., since HTTP transports all message-bodies as payload (see Section 2.3.2) and does not interpret the content or any MIME header lines that might be contained therein.

    B. Additional Features

    RFC 1945 and RFC 2068 document protocol elements used by some existing HTTP implementations, but not consistently and correctly across most HTTP/1.1 applications. Implementors are advised to be aware of these features, but cannot rely upon their presence in, or interoperability with, other HTTP/1.1 applications. Some of these describe proposed experimental features, and some describe features that experimental deployment found lacking that are now addressed in the base HTTP/1.1 specification.

    A number of other headers, such as Content-Disposition and Title, from SMTP and MIME are also often implemented (see RFC 2076 [RFC2076]).

    B.1 Content-Disposition

    The Content-Disposition response-header field has been proposed as a means for the origin server to suggest a default filename if the user requests that the content is saved to a file. This usage is derived from the definition of Content-Disposition in RFC 1806 [RFC1806].

    The receiving user agent SHOULD NOT respect any directory path information present in the filename-parm parameter, which is the only parameter believed to apply to HTTP implementations at this time. The filename SHOULD be treated as a terminal component only.
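One way a user agent might treat the suggested filename as a terminal component only is to discard everything up to the last path separator. The Java sketch below is illustrative: the class name and the fallback name `download` are assumptions, and a real user agent would apply further checks.

```java
/** Hypothetical sketch: keep only the terminal component of a suggested filename. */
final class SuggestedFilename {

    /** Drops any directory path, whether it uses "/" or "\" as the separator. */
    static String terminalComponent(String filename) {
        int cut = Math.max(filename.lastIndexOf('/'), filename.lastIndexOf('\\'));
        String name = filename.substring(cut + 1);
        // "." and ".." refer to directories, so they are not usable file names
        if (name.isEmpty() || name.equals(".") || name.equals("..")) {
            return "download"; // assumed fallback name, not defined by any specification
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(terminalComponent("../../etc/passwd"));
        System.out.println(terminalComponent("C:\\Users\\someone\\report.pdf"));
    }
}
```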

    If this header is used in a response with the application/octet-stream content-type, the implied suggestion is that the user agent should not display the response, but directly enter a 'save response as...' dialog.

    See Section 7.2 for Content-Disposition security issues.

    C. Changes from RFC 2068

    Charset wildcarding is introduced to avoid explosion of character set names in accept headers. (Section 5.2)

    A content-coding of «identity» was introduced, to solve problems discovered in caching. (Section 2.2)

    Quality Values of zero should indicate that «I don’t want something» to allow clients to refuse a representation. (Section 2.4)

    The Alternates, Content-Version, Derived-From, Link, URI, Public and Content-Base header fields were defined in previous versions of this specification, but not commonly implemented. See RFC 2068 [RFC2068].

    Copyright © The IETF Trust (2007).

    This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

    This document and the information contained herein are provided on an “AS IS” basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

    Intellectual Property

    The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

    Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository.

    The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

    Acknowledgement

    Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).

    Spring MVC Content Negotiation

    Last modified: July 14, 2020

    1. Overview

    This article describes how to implement content negotiation in a Spring MVC project.

    Generally, there are three options to determine the media type of a request:

    • Using URL suffixes (extensions) in the request (e.g., .xml/.json)
    • Using a URL parameter in the request (e.g., ?format=json)
    • Using the Accept header in the request

    By default, this is the order in which the Spring content negotiation manager will try to use these three strategies. And if none of these are enabled, we can specify a fallback to a default content type.

    2. Content Negotiation Strategies

    Let’s start with the necessary dependencies – we are working with JSON and XML representations, so for this article we’ll use Jackson for JSON:

    For XML support we can use either JAXB, XStream or the newer Jackson-XML support.

    Since we have explained the use of the Accept header in an earlier article on HttpMessageConverters, let’s focus on the first two strategies in depth.

    3. The URL Suffix Strategy

    By default, this strategy is disabled, but the framework can check for a path extension right from the URL to determine the output content type.

    Before going into configurations, let’s have a quick look at an example. We have the following simple API method implementation in a typical Spring controller:

    Let’s invoke it making use of the JSON extension to specify the media type of the resource:

    Here’s what we might get back if we use a JSON extension:

    And here’s what the request – response will look like with XML:

    The response body:

    Now if we do not use any extension or use one that is not configured – the default content type will be returned:

    Let’s now have a look at setting up this strategy – both with Java and XML configurations.

    3.1. Java Configuration

    Let’s go over the details.

    First, we’re enabling the path extensions strategy.

    Then, we’re disabling the URL parameter strategy as well as the Accept header strategy – because we want to only rely on the path extension way of determining the type of the content.

    We’re then turning off the Java Activation Framework; JAF can be used as a fallback mechanism to select the output format if the incoming request is not matching any of the strategies we configured. We’re disabling it because we’re going to configure JSON as the default content type.

    And finally – we are setting up JSON to be the default. That means if neither of the two strategies matches, all incoming requests will be mapped to a controller method that serves JSON.


    3.2. XML Configuration

    Let’s also have a quick look at the same exact configuration, only using XML:

    4. The URL Parameter Strategy

    We’ve used path extensions in the previous section – let’s now set up Spring MVC to make use of a URL parameter.

    We can enable this strategy by setting the value of the property favorParameter to true.

    Let’s have a quick look at how that would work with our previous example:

    And here’s what the JSON response body will be:

    If we use the XML parameter, the output will be in XML form:

    The response body:

    Now let’s do the configuration – again, first using Java and then XML.

    4.1. Java Configuration

    Let’s read through this configuration.

    First, of course the path extension and the Accept header strategies are disabled (as well as JAF).

    The rest of the configuration is the same.

    4.2. XML Configuration

    We can also have both strategies (extension and parameter) enabled at the same time:

    In this case Spring will look for a path extension first; if that is not present, it will look for the URL parameter. If neither is available in the incoming request, the default content type will be returned.
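The lookup order described here can be modelled in plain Java. This is not Spring's implementation, only a simplified sketch of the decision sequence; the class name, the registered `json`/`xml` keys, and the idea of passing the parameter value in directly are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Simplified model of "extension first, then parameter, then default". Not Spring code. */
final class NegotiationOrderSketch {

    static final Map<String, String> MEDIA_TYPES = new LinkedHashMap<>();
    static {
        MEDIA_TYPES.put("json", "application/json");
        MEDIA_TYPES.put("xml", "application/xml");
    }

    static final String DEFAULT_TYPE = "application/json";

    /**
     * @param path        request path, for example "/employee/10.xml"
     * @param formatParam value of the format parameter, or null when absent
     */
    static String resolve(String path, String formatParam) {
        // 1. Path extension: only a dot after the last slash counts
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot > slash) {
            String type = MEDIA_TYPES.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
            if (type != null) {
                return type;
            }
        }
        // 2. URL parameter
        if (formatParam != null) {
            String type = MEDIA_TYPES.get(formatParam.toLowerCase(Locale.ROOT));
            if (type != null) {
                return type;
            }
        }
        // 3. Configured default
        return DEFAULT_TYPE;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/employee/10.xml", null));
        System.out.println(resolve("/employee/10", "xml"));
        System.out.println(resolve("/employee/10", null));
    }
}
```

When both an extension and a parameter are present, the extension decides, which mirrors the order in the paragraph above.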

    5. The Accept Header Strategy

    If the Accept header strategy is enabled, Spring MVC will look for the header's value in the incoming request to determine the representation type.

    We have to set the value of ignoreAcceptHeader to false to enable this approach, and we’re disabling the other two strategies just so that we know we’re only relying on the Accept header.
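To make the idea concrete, here is a framework-independent Java sketch of choosing a representation from an Accept header. It is not what Spring does internally: the class name is hypothetical, only the `q` parameter is interpreted, and ties are broken by the order in which the server lists the types it can produce.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of Accept-header based selection. Not Spring code. */
final class AcceptHeaderSketch {

    /** Quality the header assigns to a media type; the most specific range decides. */
    static double quality(String acceptHeader, String mediaType) {
        String wanted = mediaType.toLowerCase(Locale.ROOT);
        int slash = wanted.indexOf('/');
        if (slash < 0) {
            return 0.0; // not a type/subtype pair
        }
        String typeWildcard = wanted.substring(0, slash) + "/*";
        double bestQ = 0.0;
        int bestSpecificity = -1;
        for (String element : acceptHeader.split(",")) {
            String[] parts = element.trim().split(";");
            String range = parts[0].trim().toLowerCase(Locale.ROOT);
            int specificity;
            if (range.equals(wanted)) {
                specificity = 2;
            } else if (range.equals(typeWildcard)) {
                specificity = 1;
            } else if (range.equals("*/*")) {
                specificity = 0;
            } else {
                continue;
            }
            double q = 1.0; // default quality
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim().toLowerCase(Locale.ROOT);
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2).trim());
                }
            }
            if (specificity > bestSpecificity) {
                bestSpecificity = specificity;
                bestQ = q;
            }
        }
        return bestQ;
    }

    /** Returns the producible type with the highest quality, or null if none is acceptable. */
    static String select(String acceptHeader, List<String> producible) {
        String best = null;
        double bestQ = 0.0;
        for (String type : producible) {
            double q = quality(acceptHeader, type);
            if (q > bestQ) {
                bestQ = q;
                best = type;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> producible = Arrays.asList("application/json", "application/xml");
        System.out.println(select("application/xml", producible));
        System.out.println(select("application/json;q=0.5, application/xml", producible));
    }
}
```

Note that the selection is driven by the request's Accept header; the Content-Type header describes the body being sent and plays no part in choosing the response format.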

    5.1. Java Configuration

    5.2. XML Configuration

    Finally, we need to switch on the content negotiation manager by plugging it into the overall configuration:

    6. Conclusion

    And we’re done. We looked at how content negotiation works in Spring MVC and we focused on a few examples of setting that up to use various strategies to determine the content type.

    The full implementation of this article can be found in the GitHub project – this is an Eclipse-based project, so it should be easy to import and run as it is.

    How to Harden Web Applications with HTTP Headers

    As we saw in the previous parts of this series, servers can send HTTP headers to give the client additional metadata about the response, on top of the content the client requested. Clients are then allowed to specify how a particular resource should be read, cached, or secured.

    Browsers now implement a very wide range of security-related headers to make it harder for attackers to exploit vulnerabilities. In this article we will go through each of them, explaining how they are used, which attacks they prevent, and a bit of the history behind each header.

    HTTP Strict Transport Security (HSTS)

    Since late 2012, "HTTPS Everywhere" advocates have had an easier way to force a client to always use the secure version of HTTP, thanks to Strict Transport Security: the very simple line Strict-Transport-Security: max-age=3600 tells the browser that for the next hour (3600 seconds) it must not interact with the application over insecure protocols.

    When a user tries to reach an application protected by HSTS over HTTP, the browser simply refuses to go any further and automatically converts http:// URLs to https://.
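The client-side behaviour can be approximated in Java as follows. This is a hedged sketch, not how any browser is implemented: the class and method names are hypothetical, only the max-age directive is read, and the port is left as it is (a real client also maps an explicit port 80 to 443).

```java
import java.net.URI;

/** Hypothetical sketch of reading an HSTS policy and upgrading a URL. */
final class HstsSketch {

    /** Returns max-age in seconds from a Strict-Transport-Security value, or -1. */
    static long maxAgeSeconds(String headerValue) {
        for (String directive : headerValue.split(";")) {
            String[] nameValue = directive.trim().split("=", 2);
            if (nameValue.length == 2 && nameValue[0].trim().equalsIgnoreCase("max-age")) {
                try {
                    return Long.parseLong(nameValue[1].trim());
                } catch (NumberFormatException e) {
                    return -1; // malformed value: treat as no usable policy
                }
            }
        }
        return -1;
    }

    /** Swaps the http scheme for https while a policy for the host is in force. */
    static URI upgrade(URI uri, boolean policyActive) {
        if (!policyActive || !"http".equalsIgnoreCase(uri.getScheme())) {
            return uri;
        }
        String rest = uri.toString().substring(uri.getScheme().length());
        return URI.create("https" + rest);
    }

    public static void main(String[] args) {
        long maxAge = maxAgeSeconds("max-age=3600");
        boolean active = maxAge > 0;
        System.out.println(upgrade(URI.create("http://localhost:7888/"), active));
    }
}
```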

    You can test this locally with the code at github.com/odino/wasec/tree/master/hsts. You will need to follow the instructions in the README (they include installing a trusted SSL certificate for localhost on your machine with the mkcert tool) and then try opening https://localhost:7889.

    This example has two servers: an HTTPS server listening on port 7889 and an HTTP server on port 7888. When you access the HTTPS server, it always tries to redirect you to the HTTP version, which works because no HSTS policy is set on the HTTPS server. If you instead add the hsts=on parameter to the URL, the browser forcibly converts the redirect link to its https:// version. Since the server on 7888 only speaks HTTP, you end up looking at a browser error page.

    You may be wondering what happens the first time a user visits your site, since no HSTS policy has been defined beforehand: attackers could potentially trick the user into visiting the http:// version of your site and carry out their attack there, so there is still room for trouble. This is a serious concern, because HSTS is a trust-on-first-use mechanism. It tries to make sure that, once a website has been visited, the browser knows that subsequent interactions must use HTTPS.

    One way around this shortcoming is to maintain a huge database of websites that enforce HSTS, which Chrome does through hstspreload.org. You first set your policy, then visit the website and check whether your site can be added to the database. For example, we can see that Facebook is on the list.

    By submitting your website to this list, you tell browsers in advance that your site uses HSTS, so that even the first interaction between clients and your server happens over a secure channel. This comes at a cost, though, because you really have to commit to HSTS. If, for whatever reason, you want your website removed from the list, that is no easy task for browser vendors:

    Be aware that inclusion in the preload list cannot easily be undone.

    Domains can be removed, but it takes months for a change to reach users with a Chrome update and we cannot make guarantees about other browsers. Don't request inclusion unless you're sure that you can support HTTPS for your entire site and all its subdomains in the long term.
    Source: https://hstspreload.org/

    This is because the vendor cannot guarantee that all users will be running the latest version of their browser, the one in which your site has been removed from the list. Think carefully, and base your decision on how much confidence you have in HSTS and on your ability to support it in the long run.

    HTTP Public Key Pinning (HPKP)

    HTTP Public Key Pinning is a mechanism that lets us tell the browser which SSL certificates to expect when it connects to our servers. The header relies on trust on first use, just like HSTS: once a client has connected to our server, it stores the certificate information for subsequent interactions. If at any point the client detects that the server is presenting a different certificate, it will politely refuse to connect, which makes man-in-the-middle (MITM) attacks very hard to carry out.

    This is what an HPKP policy looks like:

    The header declares which certificates the server will use (two of them in this case) by means of a hash of the certificates, and includes additional information such as the lifetime of the directive (max-age=3600) and a few other details. Sadly, there is no point in digging deeper into what we can do with public key pinning, since Chrome has deprecated the feature, a signal that its adoption is doomed.

    Chrome's decision is not irrational; it simply follows from the risks that come with public key pinning. If you lose your certificate, or simply make a mistake while testing, your site becomes inaccessible to users who have visited it before (for the duration of the max-age directive, which is typically weeks or months).

    Because of these potentially catastrophic consequences, HPKP adoption has been extremely low, and there have been incidents where large websites were unavailable because of a misconfiguration. All things considered, Chrome decided that users were better off without the protection offered by HPKP, and security researchers are not entirely against that decision.

    Expect-CT

    While HPKP was being deprecated, a new header stepped in to keep fraudulent SSL certificates from being served to clients: Expect-CT .

    The goal of this header is to inform the browser that it should perform additional "background checks" to make sure the certificate is genuine: when a server uses the Expect-CT header, it is essentially asking the client to verify that the certificates in use appear in public Certificate Transparency (CT) logs.

    The Certificate Transparency initiative is an effort led by Google to provide:

    An open framework for monitoring and auditing SSL certificates in nearly real time.

    Specifically, Certificate Transparency makes it possible to detect SSL certificates that have been mistakenly issued by a certificate authority or maliciously acquired from an otherwise unimpeachable certificate authority. It also makes it possible to identify certificate authorities that have gone rogue and are maliciously issuing certificates.
    Source: https://www.certificate-transparency.org/

    The header takes this form:

    In this example, the server asks the browser to:

    • enable CT verification for the current app for a period of 1 hour (3600 seconds)
    • enforce this policy and prevent access to the app if a violation occurs
    • send a report to the given URL if a violation occurs
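
    The three instructions above map onto three directives in the header value. As a minimal sketch, assuming a value of the shape max-age=3600, enforce, report-uri="..." (the report URL used below is a placeholder of mine, not one from the original example), the directives could be pulled apart like this:

```python
def parse_expect_ct(value):
    """Split an Expect-CT value into its comma-separated directives.

    Naive on purpose: a report-uri that itself contains a comma
    would need a real tokenizer.
    """
    policy = {"max_age": None, "enforce": False, "report_uri": None}
    for directive in value.split(","):
        name, _, arg = directive.strip().partition("=")
        name = name.strip().lower()
        arg = arg.strip().strip('"')
        if name == "max-age":
            policy["max_age"] = int(arg)      # lifetime in seconds
        elif name == "enforce":
            policy["enforce"] = True          # block instead of only reporting
        elif name == "report-uri":
            policy["report_uri"] = arg        # where violations are reported
    return policy


# Hypothetical header value matching the three bullet points above.
example = 'max-age=3600, enforce, report-uri="https://ct.example.com/report"'
policy = parse_expect_ct(example)
```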

    The goal of the Certificate Transparency initiative is to detect mis-issued or malicious certificates (and rogue certificate authorities) earlier, faster, and more precisely than any method used before.

    By opting in with the Expect-CT header, you can take advantage of this initiative to improve your app's security posture.

    X-Frame-Options

    Imagine seeing a web page like this one

    As soon as you click on the link, you realize that all the money in your bank account is gone. What happened?

    You were the victim of a clickjacking attack.

    The attacker directed you to their website, which displays a very attractive link to click. Unfortunately, they also embedded an iframe from your-bank.com/transfer?amount=-1&[attacker@gmail.com] in the page, but hid it by setting its opacity to 0%. You thought you clicked on the original page, trying to win a brand-new Hummer, but the browser instead registered a click on the iframe, a dangerous click that confirmed the transfer of money.

    Most banking systems require a one-time PIN code to confirm transactions, but your bank didn't catch up with the times, and all your money is gone.

    The example is pretty extreme, but it should give you an idea of what the consequences of a clickjacking attack can be. The user intends to click on a particular link, while the browser triggers a click on the "invisible" page that has been embedded as a frame.

    I have included an example of this vulnerability at github.com/odino/wasec/tree/master/clickjacking. If you run the example and try clicking on the "attractive" link, you will see that the actual click is intercepted by the iframe, which then stops being transparent so that the problem is easier to spot. The example should be available at http://localhost:7888 .

    Luckily, browsers have come up with a simple solution to the problem: X-Frame-Options (XFO), which lets you decide whether your app can be embedded as an iframe on external websites. Popularized by Internet Explorer 8, XFO was first introduced in 2009 and is still supported by all major browsers.

    It works like this: when a browser sees an iframe, it loads it and verifies that its XFO allows inclusion in the current page before rendering it. The supported values are:

    • DENY : this web page cannot be embedded anywhere. This is the highest level of protection, as it doesn't allow anyone to embed our content.
    • SAMEORIGIN : only pages from the same domain as the current one can embed this page. This means that example.com/embedder can load example.com/embedded as long as its policy is set to SAMEORIGIN . This is a more relaxed policy that lets the owners of a website embed their own pages across their application.
    • ALLOW-FROM uri : embedding is allowed from the specified URI. We could, for example, let an external, authorized website embed our content by using ALLOW-FROM https://external.com . This is generally used when you intend to let third-party developers embed your content via iframes.
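
    To make the three values concrete, here is a rough model of the decision a browser takes before rendering a frame. It is a simplification under stated assumptions: the helper names are hypothetical, only the immediate embedder is compared (real browsers have more involved rules for nested frames), and unknown values are treated as "no restriction".

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def may_embed(xfo, embedder_url, embedded_url):
    """Decide whether the page at embedder_url may frame embedded_url."""
    value = (xfo or "").strip()
    upper = value.upper()
    if not value:
        return True  # no header, no restriction
    if upper == "DENY":
        return False
    if upper == "SAMEORIGIN":
        return origin_of(embedder_url) == origin_of(embedded_url)
    if upper.startswith("ALLOW-FROM"):
        allowed = value[len("ALLOW-FROM"):].strip()
        return origin_of(embedder_url) == origin_of(allowed)
    return True  # assumption: unrecognized values are ignored
```

    With SAMEORIGIN , example.com/embedder may frame example.com/embedded , while a page on any other origin may not.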

    An example of an HTTP response that includes the strictest XFO policy possible looks like this:

    To showcase how browsers behave when XFO is enabled, we can simply change the URL of our example to http://localhost:7888/?xfo=on . The xfo=on parameter tells the server to include X-Frame-Options: deny in the response, and we can see how the browser restricts access to the iframe:

    XFO was considered the best way to prevent frame-based clickjacking attacks until another header came into play years later: the Content Security Policy, or CSP for short.

    Content Security Policy (CSP)

    The Content-Security-Policy header, often abbreviated to CSP, provides a next-generation utility belt for preventing a plethora of attacks, ranging from XSS (cross-site scripting) to clickjacking.

    To understand how CSP helps us, we should first think of an attack vector. Let's say we just built our own Google search, a simple input field with a submit button.

    This web application does nothing magical. It simply:

    • displays a form
    • lets the user execute a search
    • displays the search results alongside the keyword the user searched for

    When we execute a simple search, this is what the application returns:

    Amazing! Our application incredibly understood our search and found a related image. If we dig deeper into the source code, available at github.com/odino/wasec/tree/master/xss, we will soon realize that the application has a security issue, as whatever keyword the user searches for is printed directly into the HTML:

    This has a nasty consequence. An attacker can craft a specific link that executes arbitrary JavaScript in the victim's browser.
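
    To illustrate the problem independently of the wasec example (whose source is not reproduced here), the sketch below contrasts printing the keyword as-is with escaping it first. The function name and the markup are hypothetical; html.escape is Python's standard library helper.

```python
import html


def render_results(keyword):
    """Return (unsafe, safe) HTML fragments echoing the search keyword."""
    # Vulnerable: the keyword is concatenated into the markup untouched.
    unsafe = "<p>You searched for: " + keyword + "</p>"
    # Safer: special characters are turned into entities before output.
    safe = "<p>You searched for: " + html.escape(keyword) + "</p>"
    return unsafe, safe


unsafe, safe = render_results("<script>alert('hello')</script>")
```

    In the first fragment the browser sees a real script element; in the second it only sees text. Escaping output is the primary fix, and CSP acts as a second line of defense for the cases where escaping is forgotten.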

    If you have the time and patience to run the example locally, you will quickly see the full power of CSP. I have added a query string parameter that turns CSP on, so we can try navigating to a malicious URL with CSP turned on:

    As you can see in the example above, we have told the browser that our CSP policy only allows scripts included from the same origin as the current URL, which we can easily verify by requesting the URL with curl and looking at the response header:

    Since the XSS attack was carried out through an inline script (a script embedded directly in the HTML content), the browser politely refused to execute it, keeping our user safe. Imagine if, instead of simply displaying an alert dialog, the attacker had set up a redirect to their own domain through some JavaScript code that could look like this:

    They would have been able to steal all of the user's cookies, which might contain highly sensitive data (more on this in the next article).

    By now, it should be clear how CSP helps us prevent a range of attacks on web applications. You define a policy, and the browser strictly adheres to it, refusing to run resources that would violate it.

    An interesting variation of CSP is the report-only mode. Instead of using the Content-Security-Policy header, you can first test the impact of CSP on your website by telling the browser to simply report errors, without blocking script execution and so on, by using the Content-Security-Policy-Report-Only header.

    Reporting lets you understand what breaking changes the CSP policy you would like to roll out might cause, and fix them accordingly. We can even specify a report URL, and the browser will send us a report. Here is a full example of a report-only policy:

    CSP policies can be a bit complex on their own, as in the following example:

    This policy defines the following rules:

    • executable scripts (e.g. JavaScript) can only be loaded from scripts.example.com
    • images may be loaded from any origin ( img-src * )
    • video or audio content can be loaded from two origins: medias.example.com and medias.legacy.example.com
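
    A policy with these three rules can be read mechanically: directives are separated by semicolons, and each directive is a name followed by its allowed sources. The sketch below assumes a policy string that I reconstructed from the rules listed above (it is not copied from the original article), and the parser is deliberately minimal.

```python
def parse_csp(policy):
    """Map each directive name to its list of source expressions."""
    directives = {}
    for chunk in policy.split(";"):
        tokens = chunk.split()
        if not tokens:
            continue
        name, sources = tokens[0].lower(), tokens[1:]
        # If a directive is repeated, keep the first occurrence only.
        directives.setdefault(name, sources)
    return directives


# Reconstructed for illustration from the three rules above.
POLICY = (
    "script-src scripts.example.com; "
    "img-src *; "
    "media-src medias.example.com medias.legacy.example.com"
)

rules = parse_csp(POLICY)
```

    Here rules["media-src"] holds the two media origins, and rules["img-src"] holds the single wildcard source.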

    As you can see, policies can get lengthy, and if we want to ensure the highest protection for our users this can become a rather tedious process. Nevertheless, writing a comprehensive CSP policy is an important step towards adding an additional layer of security to our web applications.

    For more information about CSP, I would recommend developer.mozilla.org/en-US/docs/Web/HTTP/CSP.

    X-XSS-Protection

    Although superseded by CSP, the X-XSS-Protection header provides a similar type of protection. It is used to mitigate XSS attacks in older browsers that don't fully support CSP. This header is not supported by Firefox.

    Its syntax is very similar to what we have just seen:

    Reflected XSS is the most common type of attack, where the text entered is printed back by the server without any validation, and it is where this header truly shines. If you want to see this for yourself, I would recommend trying out the example at github.com/odino/wasec/tree/master/xss: by appending xss=on to the URL, it shows what a browser does when XSS protection is turned on. If we enter a malicious string in the search box, such as <script>alert('hello')</script> , the browser politely refuses to execute the script and explains the reasoning behind its decision:

    Even more interesting is Chrome's default behavior when the web page does not specify any CSP or XSS policy, a scenario we can test by adding the xss=off parameter to our URL ( http://localhost:7888/?search=%3Cscript%3Ealert%28%27hello%27%29%3C%2Fscript%3E&xss=off ):

    Amazingly, Chrome is cautious enough to prevent the page from rendering, making reflected XSS very hard to pull off. It is impressive how far browsers have come.

    Feature policy


    In July 2018, security researcher Scott Helme published a very interesting blog post detailing a new security header in the making: Feature-Policy .

    Currently supported by very few browsers (Chrome and Safari at the time of writing), this header lets us define whether a specific browser feature is enabled within the current page. With a syntax very similar to CSP, we should have no trouble understanding what a feature policy such as the following means:

    If we still have any doubts about how this policy affects the browser APIs, we can simply dissect it:

    • vibrate 'self' : allows the current page, and any frame on the current site, to use the vibration API
    • push * : the current page and any frame may use the push notification API
    • camera 'none' : access to the camera API is denied to this page and any frames

    Feature policy has a short history. If your website lets users, for example, take a selfie or record audio, it would be quite beneficial to use a policy that restricts other contexts from accessing the API through your page.

    X-Content-Type-Options

    Sometimes, clever browser features end up hurting us from a security standpoint. A clear example is MIME sniffing, a technique popularized by Internet Explorer.

    MIME sniffing is the browser's ability to auto-detect (and fix) the content type of a resource it is downloading. Say, for example, that we ask the browser to render an image at /awesome-picture.png , but the server sets the wrong type when serving it (for example, Content-Type: text/plain ). This would generally result in the browser not being able to display the image properly.

    To solve this problem, IE went to great lengths to implement MIME sniffing: when downloading a resource, the browser "scans" it and, if it detects that the resource's content type is not the one advertised by the server in the Content-Type header, it ignores the type sent by the server and interprets the resource according to the type it detected.

    Now imagine hosting a website that lets users upload their own images, and imagine a user uploading a /test.jpg file that contains JavaScript code. See where this is going? Once the file is uploaded, the site includes it in its own HTML and, when the browser tries to render the document, it finds the "image" the user just uploaded. As the browser downloads the image, it detects that it is a script and executes it in the victim's browser.

    To avoid this problem, we can set the X-Content-Type-Options: nosniff header, which completely disables MIME sniffing: by doing so, we tell the browser that we are fully aware that some files might have a mismatch between type and content, and that the browser should not worry about it. We know what we are doing, so the browser should not try to guess things, potentially posing a security threat to our users.
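
    On the server side, this amounts to sending one extra header alongside the declared content type. The following is a minimal WSGI sketch, not a production upload handler: the function name and the payload are hypothetical, and a real application would read the file from storage.

```python
def serve_upload(environ, start_response):
    """Minimal WSGI app serving an 'uploaded image' with sniffing disabled."""
    body = b"alert('not really an image')"  # hypothetical malicious upload
    headers = [
        ("Content-Type", "image/jpeg"),          # what we declare the file to be
        ("X-Content-Type-Options", "nosniff"),   # ask the browser not to second-guess it
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

    With the header in place, a browser that honors it treats the response as a (broken) JPEG instead of promoting it to a script. The app can be tried locally with wsgiref.simple_server.make_server from the standard library.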

    Cross-Origin Resource Sharing (CORS)

    In the browser, HTTP requests made via JavaScript can only be triggered within the same origin. Simply put, an AJAX request from example.com can only connect to example.com .

    This is because your browser holds information that is useful to an attacker: cookies, which are generally used to keep track of the user's session. Imagine an attacker setting up a malicious page at win-a-hummer.com that immediately triggers an AJAX request to your-bank.com . If you are logged in on the bank's website, the attacker would be able to execute HTTP requests with your credentials, potentially stealing information or, worse, wiping out your bank account.

    There might be cases, though, that require cross-origin AJAX requests, and that is why browsers implement Cross-Origin Resource Sharing (CORS), a set of directives that allow requests to be executed across domains.

    The mechanics behind CORS are quite complex, and it would not be practical to go over the whole specification, so I am going to focus on a "stripped-down" version of CORS.

    All you need to know for now is that, through the Access-Control-Allow-Origin header, your application tells the browser that it is OK to receive requests from other origins.

    The most relaxed form of this header is Access-Control-Allow-Origin: * , which allows any origin to access our application, but we can restrict it by simply specifying the URL we want to whitelist, as in Access-Control-Allow-Origin: https://example.com .
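
    Since the header carries a single origin (or the wildcard), servers that want to allow several origins usually compare the request's Origin against their own list and echo it back on a match. A minimal sketch of that idea, with a hypothetical allow-list and function name:

```python
ALLOWED_ORIGINS = {"https://example.com"}  # hypothetical allow-list


def cors_headers(request_origin):
    """Return the CORS response headers for a given request Origin."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # The response differs per origin, so caches must key on it.
            "Vary": "Origin",
        }
    return {}  # no header: the browser will hide the response from the caller
```

    Returning no header at all for unknown origins is what makes the browser block access to the response.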

    If we take a look at the example at github.com/odino/wasec/tree/master/cors, we can see how the browser prevents access to a resource on a different origin. I have set up the example to make an AJAX request from test-cors to test-cors-2 and print the result of the operation in the browser. When the server behind test-cors-2 is instructed to use CORS, the page works as you would expect. Try navigating to http://cors-test:7888/?cors=on

    But when we remove the cors parameter from the URL, the browser steps in and prevents us from accessing the content of the response:

    An important aspect we need to understand is that the browser executed the request but prevented the client from accessing the response. This is extremely important, as it still leaves us vulnerable if our request triggers any side effect on the server. Imagine, for example, that our bank allowed money transfers by simply calling the URL my-bank.com/transfer?amount=1000&from=me&to=attacker. That would be a disaster!

    As we saw at the beginning of this article, GET requests are supposed to be idempotent, but what happens if we try triggering a POST request? Luckily, I have included this scenario in the example, so we can try it by navigating to http://cors-test:7888/?method=POST :

    Instead of directly executing our POST request, which could potentially cause serious trouble on the server, the browser sent a "preflight" request. This is nothing but an OPTIONS request to the server, asking it to validate whether our origin is allowed. In this case the server did not respond positively, so the browser stops the process and our POST request never reaches the target.
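
    From the server's point of view, answering a preflight means inspecting the Origin and Access-Control-Request-Method headers of the OPTIONS request and replying with the matching Access-Control-Allow-* headers. The sketch below is one possible shape for that check; the allow-lists, the function name, and the choice of 204 and 403 as status codes are assumptions of mine, not something the wasec example prescribes.

```python
ALLOWED_ORIGINS = {"https://example.com"}  # hypothetical allow-list
ALLOWED_METHODS = {"GET", "POST"}          # hypothetical allow-list


def handle_preflight(headers):
    """Return (status, response_headers) for an OPTIONS preflight request."""
    origin = headers.get("Origin")
    method = headers.get("Access-Control-Request-Method")
    if origin in ALLOWED_ORIGINS and method in ALLOWED_METHODS:
        return 204, {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
            "Vary": "Origin",
        }
    # Without the Allow-* headers the browser never sends the real request.
    return 403, {}
```

    What matters to the browser is the presence of the allow headers rather than the exact refusal status.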

    This tells us a couple of things:

    • CORS is not a simple specification. There are quite a few scenarios to keep in mind, and you can easily get tangled in the nuances of features such as preflight requests.
    • Never expose APIs that change state via GET . An attacker can trigger those requests without a preflight request, meaning there is no protection at all.

    In my experience, I have found myself more comfortable setting up proxies that forward the request to the right server, all on the backend, rather than using CORS. This means that your application running at example.com can set up a proxy at example.com/_proxy/other.com , so that all requests falling under _proxy/other.com/* get proxied to other.com .

    I will conclude my overview of this feature here but, if you are interested in understanding CORS in depth, MDN has a very lengthy article that brilliantly covers the whole specification at developer.mozilla.org/en-US/docs/Web/HTTP/CORS.

    X-Permitted-Cross-Domain-Policies

    Closely related to CORS, X-Permitted-Cross-Domain-Policies targets cross-domain policies for Adobe products (namely Flash and Acrobat).

    I won't go much into the details, as this is a header that targets very specific use cases. Long story short, Adobe products handle cross-domain requests through a crossdomain.xml file in the root of the domain the request is targeting, and X-Permitted-Cross-Domain-Policies defines the policies for accessing this file.

    Sounds complicated? I would simply suggest adding X-Permitted-Cross-Domain-Policies: none and ignoring clients that want to make cross-domain requests with Flash.

    Referrer-Policy

    At the beginning of our careers, we all probably made the same mistake: using the Referer header to enforce security restrictions on our website. If the header contained a specific URL from a whitelist we had defined, we let users through.

    OK, maybe that wasn't every one of us. But I damn sure made this mistake back then: trusting the Referer header to give us reliable information about where the user comes from. The header was really useful until we figured that sending this information to sites could pose a potential threat to our users' privacy.

    Born at the beginning of 2017 and currently supported by all major browsers, the Referrer-Policy header can be used to mitigate these privacy concerns by telling the browser that it should only mask the URL in the Referer header, or omit it altogether.

    Here are some of the most common values Referrer-Policy can take:

    • no-referrer : the Referer header is omitted entirely
    • origin : turns https://example.com/private-page into https://example.com/
    • same-origin : sends the Referer to same-site origins but omits it for everyone else
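
    The effect of these three values can be modeled as a small function that computes what the browser would put in the Referer header. This is only a sketch covering the values listed above; the function names are hypothetical, and the real algorithm handles more policies and edge cases.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Reduce a URL to its origin, e.g. https://example.com/."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"


def referer_for(policy, page_url, target_url):
    """Return the Referer value to send, or None to omit the header."""
    if policy == "no-referrer":
        return None
    if policy == "origin":
        return origin_of(page_url)
    if policy == "same-origin":
        same = origin_of(page_url) == origin_of(target_url)
        return page_url if same else None
    raise ValueError(f"policy not covered by this sketch: {policy}")
```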

    It is worth noting that there are many more variations of Referrer-Policy ( strict-origin , no-referrer-when-downgrade , etc.), but the ones I mentioned above will probably cover most of your use cases. If you wish to better understand each variation you can use, I would recommend heading to the OWASP page.

    The Origin header is very similar to Referer , as it is sent by the browser in cross-domain requests to make sure the caller is allowed to access a resource on a different domain. The Origin header is controlled by the browser, so scripts running on a malicious page cannot tamper with it. You might be tempted to use it as a firewall for your web application: if the Origin is in our whitelist, let the request through.

    One thing to consider, though, is that other HTTP clients such as cURL can present their own origin: a simple curl -H "Origin: example.com" api.example.com renders all origin-based firewall rules ineffective... and that is why you cannot rely on Origin (or Referer , as we have just seen) to build a firewall that keeps malicious clients away.
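
    The same point holds for any non-browser client, not just cURL. As a small illustration, Python's standard urllib lets a script claim whatever origin it likes; the sketch below only builds the request object and never sends it, and the URLs are placeholders.

```python
import urllib.request


def forged_request(url, claimed_origin):
    """Build (without sending) a request that claims an arbitrary origin."""
    return urllib.request.Request(url, headers={"Origin": claimed_origin})


req = forged_request("https://api.example.com/", "https://example.com")
print(req.get_header("Origin"))  # prints https://example.com
```

    Nothing in the protocol stops the client from doing this, which is why the header is useful for protecting well-behaved browsers' users but not for authenticating callers.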

    Testing your security posture

    I want to conclude this article with a reference to securityheaders.com, an incredibly useful website that lets you verify that your web application has the right security-related headers in place. After you submit a URL, you are handed a grade and a header-by-header breakdown. Here is an example report for facebook.com:

    If you are in doubt about where to start, securityheaders.com is a great place to get a first assessment.

    Stateful HTTP: managing sessions with cookies

    This article should have introduced us to a few interesting HTTP headers, allowing us to understand how they harden our web applications through protocol-specific features, together with a bit of help from mainstream browsers.

    In the next post, we will delve into one of the most misunderstood features of the HTTP protocol: cookies.

    Born to bring some sort of state to the otherwise stateless HTTP, cookies are (and have been) used by probably each and every one of us to support sessions in our web apps: whenever there is some state we would like to persist, it is always easy to say "store it in a cookie". As we will see, cookies are not always the safest of vaults, and must be treated carefully when dealing with sensitive information.

    Content negotiation / RFC 2068

    Most HTTP responses include an entity which contains information for interpretation by a human user. Naturally, it is desirable to supply the user with the "best available" entity corresponding to the request. Unfortunately for servers and caches, not all users have the same preferences for what is "best," and not all user agents are equally capable of rendering all entity types. For that reason, HTTP has provisions for several mechanisms for "content negotiation": the process of selecting the best representation for a given response when there are multiple representations available.

    Any response containing an entity-body MAY be subject to negotiation, including error responses.

    There are two kinds of content negotiation which are possible in HTTP: server-driven and agent-driven negotiation. These two kinds of negotiation are orthogonal and thus may be used separately or in combination. One method of combination, referred to as transparent negotiation, occurs when a cache uses the agent-driven negotiation information provided by the origin server in order to provide server-driven negotiation for subsequent requests.

    12.1 Server-driven Negotiation

    If the selection of the best representation for a response is made by an algorithm located at the server, it is called server-driven negotiation. Selection is based on the available representations of the response (the dimensions over which it can vary; e.g. language, content-coding, etc.) and the contents of particular header fields in the request message or on other information pertaining to the request (such as the network address of the client).

    Server-driven negotiation is advantageous when the algorithm for selecting from among the available representations is difficult to describe to the user agent, or when the server desires to send its "best guess" to the client along with the first response (hoping to avoid the round-trip delay of a subsequent request if the "best guess" is good enough for the user). In order to improve the server’s guess, the user agent MAY include request header fields (Accept, Accept-Language, Accept-Encoding, etc.) which describe its preferences for such a response.

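    Server-driven negotiation has disadvantages:

    • It is impossible for the server to accurately determine what might be "best" for any given user, since that would require complete knowledge of both the capabilities of the user agent and the intended use for the response (e.g., does the user want to view it on screen or print it on paper?).
    • Having the user agent describe its capabilities in every request can be both very inefficient (given that only a small percentage of responses have multiple representations) and a potential violation of the user’s privacy.
    • It complicates the implementation of an origin server and the algorithms for generating responses to a request.
    • It may limit a public cache’s ability to use the same response for multiple user requests.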

    HTTP/1.1 includes the following request-header fields for enabling server-driven negotiation through description of user agent capabilities and user preferences: Accept (section 14.1), Accept-Charset (section 14.2), Accept-Encoding (section 14.3), Accept-Language (section 14.4), and User-Agent (section 14.43). However, an origin server is not limited to these dimensions and MAY vary the response based on any aspect of the request, including information outside the request-header fields or within extension header fields not defined by this specification.

    The Vary header field can be used to express the parameters the server uses to select a representation that is subject to server-driven negotiation. See section 13.6 for use of the Vary header field by caches and section 14.44 for use of the Vary header field by servers.
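
    As an informal illustration of server-driven negotiation over a single dimension (media type), the sketch below parses an Accept header, scores each available representation by the quality value of the most specific matching media range, and picks the best one. It is a simplified model under stated assumptions: the function names are hypothetical, media type parameters other than q are ignored, and ties go to whichever representation the server lists first.

```python
def parse_accept(header):
    """Return [(media_range, q)] from an Accept header value."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        ranges.append((parts[0].lower(), q))
    return ranges


def specificity(media_range):
    """*/* is least specific, type/* is next, type/subtype is most."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def quality(media_type, ranges):
    """q of the most specific range matching media_type, or 0.0."""
    best = (-1, 0.0)
    main = media_type.split("/")[0]
    for rng, q in ranges:
        if rng in (media_type, main + "/*", "*/*"):
            best = max(best, (specificity(rng), q))
    return best[1]


def negotiate(accept, available):
    """Pick the best available media type, or None if nothing is acceptable."""
    if not available:
        return None
    ranges = parse_accept(accept)
    scored = [(quality(t, ranges), t) for t in available]
    q, choice = max(scored, key=lambda s: s[0])
    return choice if q > 0 else None
```

    A server that selects a representation this way would typically also send Vary: Accept, so that caches know the response depends on that request header, and could answer 406 (Not Acceptable) when negotiate returns None.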

    12.2 Agent-driven Negotiation

    With agent-driven negotiation, selection of the best representation for a response is performed by the user agent after receiving an initial response from the origin server. Selection is based on a list of the available representations of the response included within the header fields or entity-body of the initial response, with each representation identified by its own URI. Selection from among the representations may be performed automatically (if the user agent is capable of doing so) or manually by the user selecting from a generated (possibly hypertext) menu.

    Agent-driven negotiation is advantageous when the response would vary over commonly-used dimensions (such as type, language, or encoding), when the origin server is unable to determine a user agent’s capabilities from examining the request, and generally when public caches are used to distribute server load and reduce network usage.

    Agent-driven negotiation suffers from the disadvantage of needing a second request to obtain the best alternate representation. This second request is only efficient when caching is used. In addition, this specification does not define any mechanism for supporting automatic selection, though it also does not prevent any such mechanism from being developed as an extension and used within HTTP/1.1.

    HTTP/1.1 defines the 300 (Multiple Choices) and 406 (Not Acceptable) status codes for enabling agent-driven negotiation when the server is unwilling or unable to provide a varying response using server-driven negotiation.

    12.3 Transparent Negotiation

    Transparent negotiation is a combination of both server-driven and agent-driven negotiation. When a cache is supplied with a form of the list of available representations of the response (as in agent-driven negotiation) and the dimensions of variance are completely understood by the cache, then the cache becomes capable of performing server-driven negotiation on behalf of the origin server for subsequent requests on that resource.

    Transparent negotiation has the advantage of distributing the negotiation work that would otherwise be required of the origin server and also removing the second request delay of agent-driven negotiation when the cache is able to correctly guess the right response.

    This specification does not define any mechanism for transparent negotiation, though it also does not prevent any such mechanism from being developed as an extension that could be used within HTTP/1.1.

    Content Negotiation

    Apache’s support for content negotiation has been updated to meet the HTTP/1.1 specification. It can choose the best representation of a resource based on the browser-supplied preferences for media type, languages, character set and encoding. It also implements a couple of features to give more intelligent handling of requests from browsers which send incomplete negotiation information.

    Content negotiation is provided by the mod_negotiation module, which is compiled in by default.

    About Content Negotiation

    A resource may be available in several different representations. For example, it might be available in different languages or different media types, or a combination. One way of selecting the most appropriate choice is to give the user an index page, and let them select. However, it is often possible for the server to choose automatically. This works because browsers can send, as part of each request, information about what representations they prefer. For example, a browser could indicate that it would like to see information in French, if possible, else English will do. Browsers indicate their preferences by headers in the request. To request only French representations, the browser would send

        Accept-Language: fr

    Note that this preference will only be applied when there is a choice of representations and they vary by language.

    As an example of a more complex request, this browser has been configured to accept French and English, but prefer French, and to accept various media types, preferring HTML over plain text or other text types, and preferring GIF or JPEG over other media types, but also allowing any other media type as a last resort:

    Apache 1.2 supports ‘server driven’ content negotiation, as defined in the HTTP/1.1 specification. It fully supports the Accept, Accept-Language, Accept-Charset and Accept-Encoding request headers. Apache 1.3.4 also supports ‘transparent’ content negotiation, which is an experimental negotiation protocol defined in RFC 2295 and RFC 2296. It does not offer support for ‘feature negotiation’ as defined in these RFCs.

    A resource is a conceptual entity identified by a URI (RFC 2396). An HTTP server like Apache provides access to representations of the resource(s) within its namespace, with each representation in the form of a sequence of bytes with a defined media type, character set, encoding, etc. Each resource may be associated with zero, one, or more than one representation at any given time. If multiple representations are available, the resource is referred to as negotiable and each of its representations is termed a variant. The ways in which the variants for a negotiable resource vary are called the dimensions of negotiation.

    Negotiation in Apache

    In order to negotiate a resource, the server needs to be given information about each of the variants. This is done in one of two ways:

    • Using a type map (i.e., a *.var file) which names the files containing the variants explicitly, or
    • Using a ‘MultiViews’ search, where the server does an implicit filename pattern match and chooses from among the results.

    Using a type-map file

    A type map is a document which is associated with the handler named type-map (or, for backwards-compatibility with older Apache configurations, the MIME type application/x-type-map). Note that to use this feature, you must have a handler set in the configuration that defines a file suffix as type-map; this is best done with

    AddHandler type-map .var

    in the server configuration file. See the comments in the sample config file for more details.

    Type map files have an entry for each available variant; these entries consist of contiguous HTTP-format header lines. Entries for different variants are separated by blank lines. Blank lines are illegal within an entry. It is conventional to begin a map file with an entry for the combined entity as a whole (although this is not required, and if present will be ignored). An example map file is shown below. In this example, the file would be named foo.var and would be placed in the same directory with the various variants of the resource foo.

    URI: foo.en.html
    Content-type: text/html
    Content-language: en

    URI: foo.fr.de.html
    Content-type: text/html;charset=iso-8859-2
    Content-language: fr, de

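    If the variants have different source qualities, that may be indicated by the «qs» parameter to the media type, as in this picture (available as JPEG, GIF, or ASCII-art):

    URI: foo.jpeg
    Content-type: image/jpeg; qs=0.8

    URI: foo.gif
    Content-type: image/gif; qs=0.5

    URI: foo.txt
    Content-type: text/plain; qs=0.01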

    qs values can vary in the range 0.000 to 1.000. Note that any variant with a qs value of 0.000 will never be chosen. Variants with no ‘qs’ parameter value are given a qs factor of 1.0. The qs parameter indicates the relative ‘quality’ of this variant compared to the other available variants, independent of the client’s capabilities. For example, a jpeg file is usually of higher source quality than an ascii file if it is attempting to represent a photograph. However, if the resource being represented is an original ascii art, then an ascii representation would have a higher source quality than a jpeg representation. A qs value is therefore specific to a given variant depending on the nature of the resource it represents.

    The full list of headers recognized is:

    • URI: URI of the file containing the variant (of the given media type, encoded with the given content encoding). These are interpreted as URLs relative to the map file; they must be on the same server (!), and they must refer to files to which the client would be granted access if they were to be requested directly.
    • Content-Type: media type; charset, level and «qs» parameters may be given. These are often referred to as MIME types; typical media types are image/gif, text/plain, or text/html; level=3.
    • Content-Language: The languages of the variant, specified as an Internet standard language tag from RFC 1766 (e.g., en for English, ko for Korean, etc.).
    • Content-Encoding: If the file is compressed, or otherwise encoded, rather than containing the actual raw data, this says how that was done. Apache only recognizes encodings that are defined by an AddEncoding directive. This normally includes the encodings x-compress for compress’d files, and x-gzip for gzip’d files. The x- prefix is ignored for encoding comparisons.
    • Content-Length: The size of the file. Specifying content lengths in the type-map allows the server to compare file sizes without checking the actual files.
    • Description: A human-readable textual description of the variant. If Apache cannot find any appropriate variant to return, it will return an error response which lists all available variants instead. Such a variant list will include the human-readable variant descriptions.

    Multiviews

    MultiViews is a per-directory option, meaning it can be set with an Options directive within a <Directory>, <Location> or <Files> section in access.conf, or (if AllowOverride is properly set) in .htaccess files. Note that Options All does not set MultiViews; you have to ask for it by name.

    The effect of MultiViews is as follows: if the server receives a request for /some/dir/foo , if /some/dir has MultiViews enabled, and /some/dir/foo does not exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client’s requirements.

    MultiViews may also apply to searches for the file named by the DirectoryIndex directive, if the server is trying to index a directory. If the configuration files specify

    DirectoryIndex index

    then the server will arbitrate between index.html and index.html3 if both are present. If neither are present, and index.cgi is there, the server will run it.

    If one of the files found when reading the directory is a CGI script, it’s not obvious what should happen. The code gives that case special treatment: if the request was a POST, or a GET with QUERY_ARGS or PATH_INFO, the script is given an extremely high quality rating, and generally invoked; otherwise it is given an extremely low quality rating, which generally causes one of the other views (if any) to be retrieved.

    The Negotiation Methods

    There are two negotiation methods:

    1. Server driven negotiation with the Apache algorithm is used in the normal case. The Apache algorithm is explained in more detail below. When this algorithm is used, Apache can sometimes ‘fiddle’ the quality factor of a particular dimension to achieve a better result. The ways Apache can fiddle quality factors are explained in more detail below.
    2. Transparent content negotiation is used when the browser specifically requests this through the mechanism defined in RFC 2295. This negotiation method gives the browser full control over deciding on the ‘best’ variant; the result is therefore dependent on the specific algorithms used by the browser. As part of the transparent negotiation process, the browser can ask Apache to run the ‘remote variant selection algorithm’ defined in RFC 2296.


    Dimensions of Negotiation

    Dimension | Notes
    Media Type | Browser indicates preferences with the Accept header field. Each item can have an associated quality factor. Variant description can also have a quality factor (the «qs» parameter).
    Language | Browser indicates preferences with the Accept-Language header field. Each item can have a quality factor. Variants can be associated with none, one or more than one language.
    Encoding | Browser indicates preference with the Accept-Encoding header field. Each item can have a quality factor.
    Charset | Browser indicates preference with the Accept-Charset header field. Each item can have a quality factor. Variants can indicate a charset as a parameter of the media type.

    Apache Negotiation Algorithm

    Apache can use the following algorithm to select the ‘best’ variant (if any) to return to the browser. This algorithm is not further configurable. It operates as follows:

    1. First, for each dimension of the negotiation, check the appropriate Accept* header field and assign a quality to each variant. If the Accept* header for any dimension implies that this variant is not acceptable, eliminate it. If no variants remain, go to step 4.
    2. Select the ‘best’ variant by a process of elimination. Each of the following tests is applied in order. Any variants not selected at each test are eliminated. After each test, if only one variant remains, select it as the best match and proceed to step 3. If more than one variant remains, move on to the next test.
      1. Multiply the quality factor from the Accept header with the quality-of-source factor for this variant’s media type, and select the variants with the highest value.
      2. Select the variants with the highest language quality factor.
      3. Select the variants with the best language match, using either the order of languages in the Accept-Language header (if present), or else the order of languages in the LanguagePriority directive (if present).
      4. Select the variants with the highest ‘level’ media parameter (used to give the version of text/html media types).
      5. Select variants with the best charset media parameters, as given on the Accept-Charset header line. Charset ISO-8859-1 is acceptable unless explicitly excluded. Variants with a text/* media type but not explicitly associated with a particular charset are assumed to be in ISO-8859-1.
      6. Select those variants which have associated charset media parameters that are not ISO-8859-1. If there are no such variants, select all variants instead.
      7. Select the variants with the best encoding. If there are variants with an encoding that is acceptable to the user-agent, select only these variants. Otherwise if there is a mix of encoded and non-encoded variants, select only the unencoded variants. If either all variants are encoded or all variants are not encoded, select all variants.
      8. Select the variants with the smallest content length.
      9. Select the first variant of those remaining. This will be either the first listed in the type-map file, or when variants are read from the directory, the one whose file name comes first when sorted using ASCII code order.
    3. The algorithm has now selected one ‘best’ variant, so return it as the response. The HTTP response header Vary is set to indicate the dimensions of negotiation (browsers and caches can use this information when caching the resource). End.

    4. To get here means no variant was selected (because none are acceptable to the browser). Return a 406 status (meaning «No acceptable representation») with a response body consisting of an HTML document listing the available variants. Also set the HTTP Vary header to indicate the dimensions of variance.
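
The elimination process in step 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Apache's implementation: the variant dictionaries, their field names (`accept_q`, `qs`, `lang_q`, `length`) and the sample values are hypothetical, step 1 (dropping unacceptable variants) is assumed to have happened already, and only three of the nine tests are modelled.

```python
def negotiate(variants, tests):
    """Apply each test in order, keeping only the top-scoring variants."""
    if not variants:
        return None  # step 4: nothing acceptable, the server would answer 406
    remaining = list(variants)
    for score in tests:
        best = max(score(v) for v in remaining)
        remaining = [v for v in remaining if score(v) == best]
        if len(remaining) == 1:
            break  # a single variant is left: it is the best match
    return remaining[0]  # final test: first of those remaining

# Hypothetical subset of the tests: media quality, language quality, size.
TESTS = [
    lambda v: v["accept_q"] * v["qs"],  # test 1: Accept q times source quality
    lambda v: v["lang_q"],              # test 2: language quality factor
    lambda v: -v["length"],             # test 8: smallest content length wins
]

variants = [
    {"name": "foo.en.html", "accept_q": 1.0, "qs": 1.0, "lang_q": 0.5, "length": 2000},
    {"name": "foo.fr.html", "accept_q": 1.0, "qs": 1.0, "lang_q": 1.0, "length": 3000},
    {"name": "foo.fr.txt", "accept_q": 0.8, "qs": 1.0, "lang_q": 1.0, "length": 500},
]
best = negotiate(variants, TESTS)
print(best["name"])
```

Here the text variant drops out at the first test, and the language test then decides between the two HTML variants, so the size test is never reached.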

      You should be aware that the error message returned by Apache is necessarily rather terse and might confuse some users (even though it lists the available alternatives). If you want to avoid users seeing this error page, you should organize your documents such that a document in a default language (or with a default encoding etc.) is always returned if a document is not available in any of the languages, encodings etc. the browser asked for.

      In particular, if you want a document in a default language to be returned if a document is not available in any of the languages a browser asked for, you should create a document with no language attribute set. See Variants with no Language below for details.

      Fiddling with Quality Values

      Apache sometimes changes the quality values from what would be expected by a strict interpretation of the Apache negotiation algorithm above. This is to get a better result from the algorithm for browsers which do not send full or accurate information. Some of the most popular browsers send Accept header information which would otherwise result in the selection of the wrong variant in many cases. If a browser sends full and correct information these fiddles will not be applied.

      Media Types and Wildcards

      The Accept: request header indicates preferences for media types. It can also include ‘wildcard’ media types, such as «image/*» or «*/*» where the * matches any string. So a request including:

      Accept: image/*, */*

      would indicate that any type starting «image/» is acceptable, as is any other type (so the first «image/*» is redundant). Some browsers routinely send wildcards in addition to explicit types they can handle. For example:

      Accept: text/html, text/plain, image/gif, image/jpeg, */*

      The intention of this is to indicate that the explicitly listed types are preferred, but if a different representation is available, that is ok too. However, under the basic algorithm, as given above, the */* wildcard has exactly equal preference to all the other types, so they are not being preferred. The browser should really have sent a request with a lower quality (preference) value for */*, such as:

      Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01

      The explicit types have no quality factor, so they default to a preference of 1.0 (the highest). The wildcard */* is given a low preference of 0.01, so other types will only be returned if no variant matches an explicitly listed type.

      If the Accept: header contains no q factors at all, Apache sets the q value of «*/*», if present, to 0.01 to emulate the desired behaviour. It also sets the q value of wildcards of the format «type/*» to 0.02 (so these are preferred over matches against «*/*»). If any media type on the Accept: header contains a q factor, these special values are not applied, so requests from browsers which send the correct information to start with work as expected.
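
The wildcard adjustment described above can be sketched as follows. This is an illustrative reading of the rule, not Apache's code; the function name `fiddle_wildcards` and the input format (a list of media ranges paired with the q value sent, or `None` when no q was sent) are assumptions made for the example.

```python
def fiddle_wildcards(accept_items):
    """Return (media_range, q) pairs with the wildcard defaults applied.

    accept_items: list of (media_range, q) where q is None if the browser
    sent no q factor for that item.
    """
    if any(q is not None for _, q in accept_items):
        # At least one explicit q factor: trust the browser and only
        # fill in the normal default of 1.0 for the items without one.
        return [(r, 1.0 if q is None else q) for r, q in accept_items]
    fiddled = []
    for media_range, _ in accept_items:
        if media_range == "*/*":
            q = 0.01  # lowest preference
        elif media_range.endswith("/*"):
            q = 0.02  # preferred over */* but below explicit types
        else:
            q = 1.0
        fiddled.append((media_range, q))
    return fiddled

no_q = fiddle_wildcards([("text/html", None), ("image/*", None), ("*/*", None)])
with_q = fiddle_wildcards([("text/html", None), ("*/*", 0.5)])
print(no_q)
print(with_q)
```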

      Variants with no Language

      If some of the variants for a particular resource have a language attribute, and some do not, those variants with no language are given a very low language quality factor of 0.001.

      The reason for setting this language quality factor for variants with no language to a very low value is to allow for a default variant which can be supplied if none of the other variants match the browser’s language preferences. This allows you to avoid users seeing a «406» error page if their browser is set to only accept languages which you do not offer for the resource that was requested.

      For example, consider the situation with Multiviews enabled and three variants:

      • foo.en.html, language en
      • foo.fr.html, language fr
      • foo.html, no language

      The meaning of a variant with no language is that it is always acceptable to the browser. If the request is for foo and the Accept-Language header includes either en or fr (or both) one of foo.en.html or foo.fr.html will be returned. If the browser does not list either en or fr as acceptable, foo.html will be returned instead. If the client requests foo.html instead, then no negotiation will occur since the exact match will be returned. To avoid this problem, it is sometimes helpful to name the «no language» variant foo.html.html to assure that Multiviews and language negotiation will come into play.
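
A small sketch of how the 0.001 factor makes foo.html the fallback for the three variants listed above. It assumes exact language-tag matching only and ignores every other dimension; the names `VARIANTS`, `language_quality` and `choose` are hypothetical.

```python
# The three variants from the example; None means "no language attribute".
VARIANTS = {"foo.en.html": "en", "foo.fr.html": "fr", "foo.html": None}

def language_quality(variant_lang, accepted):
    """accepted maps language tags from Accept-Language to their q values."""
    if variant_lang is None:
        return 0.001  # very low, but never zero: always acceptable
    return accepted.get(variant_lang, 0.0)

def choose(accepted):
    scores = {name: language_quality(lang, accepted)
              for name, lang in VARIANTS.items()}
    return max(scores, key=scores.get)

print(choose({"fr": 1.0, "en": 0.5}))  # a listed language wins
print(choose({"de": 1.0}))             # neither en nor fr listed: fallback
```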

      Extensions to Transparent Content Negotiation

      If you are using language negotiation you can choose between different naming conventions, because files can have more than one extension, and the order of the extensions is normally irrelevant (see mod_mime documentation for details).

      A typical file has a MIME-type extension (e.g., html ), maybe an encoding extension (e.g., gz ), and of course a language extension (e.g., en ) when we have different language variants of this file.

      Here are some more examples of filenames together with valid and invalid hyperlinks:

      Filename | Valid hyperlink | Invalid hyperlink
      foo.html.en | foo, foo.html | -
      foo.en.html | foo | foo.html
      foo.html.en.gz | foo, foo.html | foo.gz, foo.html.gz
      foo.en.html.gz | foo | foo.html, foo.html.gz, foo.gz
      foo.gz.html.en | foo, foo.gz, foo.gz.html | foo.html
      foo.html.gz.en | foo, foo.html, foo.html.gz | foo.gz

      Looking at the table above you will notice that it is always possible to use the name without any extensions in a hyperlink (e.g., foo). The advantage is that you can hide the actual type of a document or file and can change it later, e.g., from html to shtml or cgi, without changing any hyperlink references.

      If you want to continue to use a MIME-type in your hyperlinks (e.g. foo.html ) the language extension (including an encoding extension if there is one) must be on the right hand side of the MIME-type extension (e.g., foo.html.en ).
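
The pattern in the table follows from how MultiViews searches: a request for a name matches files called name.*, so a hyperlink works when it is a dotted prefix of the filename. The check below is a sketch of that name-matching rule only (it does not model the negotiation that happens afterwards), and `is_valid_hyperlink` is a hypothetical helper.

```python
def is_valid_hyperlink(filename, link):
    """True if a MultiViews search for `link` would find `filename`."""
    return filename.startswith(link + ".")

# A few rows from the table above: (filename, valid links, invalid links)
ROWS = [
    ("foo.html.en", ["foo", "foo.html"], []),
    ("foo.en.html", ["foo"], ["foo.html"]),
    ("foo.html.en.gz", ["foo", "foo.html"], ["foo.gz", "foo.html.gz"]),
    ("foo.gz.html.en", ["foo", "foo.gz", "foo.gz.html"], ["foo.html"]),
]

for filename, valid, invalid in ROWS:
    for link in valid:
        print(filename, link, is_valid_hyperlink(filename, link))
    for link in invalid:
        print(filename, link, is_valid_hyperlink(filename, link))
```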

      Note on Caching

      When a cache stores a representation, it associates it with the request URL. The next time that URL is requested, the cache can use the stored representation. But, if the resource is negotiable at the server, this might result in only the first requested variant being cached and subsequent cache hits might return the wrong response. To prevent this, Apache normally marks all responses that are returned after content negotiation as non-cacheable by HTTP/1.0 clients. Apache also supports the HTTP/1.1 protocol features to allow caching of negotiated responses.

      For requests which come from a HTTP/1.0 compliant client (either a browser or a cache), the directive CacheNegotiatedDocs can be used to allow caching of responses which were subject to negotiation. This directive can be given in the server config or virtual host, and takes no arguments. It has no effect on requests from HTTP/1.1 clients.


      About the Session Description Protocol

      An important part of SIP


      One of the important components of establishing a connection over SIP is the Session Description Protocol, or SDP for short.

      SDP was first described in 1998, in RFC 2327. Eight years later, in 2006, the protocol was revised in RFC 4566.

      SDP is used to set up a connection and to negotiate the parameters for sending and receiving audio or video streams between endpoints. The most important parameters exchanged are IP addresses, port numbers and codecs. Let's take a closer look.

      SDP example

      When a session is established, the SDP parameters are carried inside SIP requests. Let's look at one such request. In this case we parse a SIP INVITE that arrived at our Asterisk IP PBX, captured with the sngrep utility:

      In the example, the main part of the SIP message is separated from the SDP segment by an empty line. In addition, the Content-Type field indicates that the message body contains SDP parameters.

      SDP fields

      Each parameter of an SDP message belongs to one of the following categories:

      • Session name;
      • Time during which the session is active;
      • Media parameters;
      • Bandwidth information;
      • Contact information.

      Let's go through the main parameters. They always have the form <field>=<value>. The field name is always a single letter.

      Field | Meaning | Format
      v= | protocol version | v=0
      o= | session originator and the associated identifiers | o=<username> <sess-id> <sess-version> <nettype> <addrtype> <unicast-address>. In our example, o=Sippy 1011212504475793896 1 IN IP4 80.xx.yy.zz (IN is the network type, Internet; IP4 is the address type, IPv4).
      s= | session name | In our example a dash ("-"): no session name is given.
      c= | connection information | c=<nettype> <addrtype> <connection-address>. In our example, c=IN IP4 80.xx.yy.zz. The IN and IP4 parameters are explained above.
      t= | time during which the session is active | t=<start-time> <stop-time>. This field is mandatory, but its value is fairly arbitrary, since the exact start and end times cannot be predicted. In our example, t=0 0.
      m= | media type, format and addressing | m=<media> <port> <proto> <fmt> ... In our example, m=audio 57028 RTP/AVP 0 8 18 101. This means audio is being offered (the value can also be video, and a session can carry both); the port is 57028; the transport RTP/AVP means RTP under the profile for Audio and Video Conferences with Minimal Control, described in RFC 3551. The numbers that follow are payload types: 0 is G.711 u-law, 8 is G.711 A-law, and 18 is G.729. In other words, the codecs are offered in order of preference: G.711 u-law first, then G.711 A-law, then G.729. 101 is a dynamic payload type, used for DTMF for example.
      a= | session attributes | a=<attribute> or a=<attribute>:<value>. An SDP session can contain several additional attributes. They are covered in more detail below.
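
As an illustration of the <field>=<value> layout, the sketch below splits an SDP body into fields and decodes the m= line. The sample body is assembled only from the field values quoted in this article (the masked address 80.xx.yy.zz is kept as is); the helper names `parse_sdp` and `parse_media` are hypothetical, and real SDP handling needs more validation than this.

```python
SDP_BODY = """\
v=0
o=Sippy 1011212504475793896 1 IN IP4 80.xx.yy.zz
s=-
c=IN IP4 80.xx.yy.zz
t=0 0
m=audio 57028 RTP/AVP 0 8 18 101
a=rtpmap:0 PCMU/8000
a=sendrecv
"""

# Static payload types mentioned in the table above (RFC 3551).
PAYLOAD_NAMES = {"0": "G.711 u-law", "8": "G.711 A-law", "18": "G.729"}

def parse_sdp(body):
    """Return (field, value) pairs; fields such as a= may repeat, so no dict."""
    fields = []
    for line in body.splitlines():
        line = line.strip()
        if len(line) >= 2 and line[1] == "=":
            fields.append((line[0], line[2:]))
    return fields

def parse_media(value):
    """Decode the value of an m= line: <media> <port> <proto> <fmt> ..."""
    media, port, proto, *formats = value.split()
    return {"media": media, "port": int(port), "proto": proto, "formats": formats}

fields = parse_sdp(SDP_BODY)
media = parse_media(next(value for field, value in fields if field == "m"))
offered = [PAYLOAD_NAMES.get(fmt, "dynamic") for fmt in media["formats"]]
print(media)
print(offered)
```

The order of `offered` mirrors the order of preference in the m= line.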

      Besides the parameters above, you will often come across k=, which describes the encryption method, and i=, which carries additional information about the session. Now for the parameters of the a= field:

      Parameter | Syntax and description
      rtpmap | a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]. This attribute gives the codec name, clock rate and other encoding parameters for the payload types listed in the m= field. For example, a=rtpmap:0 PCMU/8000 means G.711 with pulse-code modulation, u-law companding, at a sampling rate of 8000 Hz.
      sendrecv | a=sendrecv. Indicates that we intend to both send and receive media. The alternatives are send only (sendonly), receive only (recvonly) and media disabled (inactive).
      ptime | a=ptime:<packet time>. The duration of an RTP packet in milliseconds; roughly speaking, how long a fragment of speech one RTP packet carries.
      fmtp | a=fmtp:<format> <format specific parameters>. Carries additional format-specific parameters, for example the silence suppression mode (VAD) and others.
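
The rtpmap attribute is easy to take apart by hand. The sketch below parses the a=rtpmap:0 PCMU/8000 line from the article; the second line, with an encoding parameter (channel count), is the L16 example from RFC 4566 and is not part of the captured INVITE. `parse_rtpmap` is a hypothetical helper name.

```python
def parse_rtpmap(line):
    """Parse 'a=rtpmap:<payload type> <name>/<clock rate>[/<parameters>]'."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute: " + line)
    payload, encoding = line[len(prefix):].split(None, 1)
    name, rate, *params = encoding.strip().split("/")
    return {
        "payload_type": int(payload),
        "encoding": name,
        "clock_rate": int(rate),
        # The optional third part is the channel count for audio streams.
        "parameters": params[0] if params else None,
    }

pcmu = parse_rtpmap("a=rtpmap:0 PCMU/8000")
l16 = parse_rtpmap("a=rtpmap:98 L16/11025/2")
print(pcmu)
print(l16)
```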


      Content Negotiation

      Apache HTTPD supports content negotiation as described in the HTTP/1.1 specification. It can choose the best representation of a resource based on the browser-supplied preferences for media type, languages, character set and encoding. It also implements a couple of features to give more intelligent handling of requests from browsers that send incomplete negotiation information.

      Content negotiation is provided by the mod_negotiation module, which is compiled in by default.

      About Content Negotiation

      A resource may be available in several different representations. For example, it might be available in different languages or different media types, or a combination. One way of selecting the most appropriate choice is to give the user an index page, and let them select. However, it is often possible for the server to choose automatically. This works because browsers can send, as part of each request, information about what representations they prefer. For example, a browser could indicate that it would like to see information in French, if possible, else English will do. Browsers indicate their preferences by headers in the request. To request only French representations, the browser would send

      Accept-Language: fr

      Note that this preference will only be applied when there is a choice of representations and they vary by language.

      As an example of a more complex request, this browser has been configured to accept French and English, but prefer French, and to accept various media types, preferring HTML over plain text or other text types, and preferring GIF or JPEG over other media types, but also allowing any other media type as a last resort:

      Accept-Language: fr; q=1.0, en; q=0.5
      Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
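
To make the q values concrete, here is a minimal sketch of how a server might split such headers into (value, q) pairs and order them by preference. It is not Apache's parser: `parse_accept` is a hypothetical helper, and it ignores media type parameters other than q.

```python
def parse_accept(header):
    """Parse an Accept-style header into (value, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [field.strip() for field in part.split(";")]
        value, q = fields[0], 1.0  # q defaults to 1.0 when absent
        for param in fields[1:]:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                q = float(raw.strip())
        items.append((value, q))
    # sorted() is stable, so items with equal q keep their header order
    return sorted(items, key=lambda item: item[1], reverse=True)

languages = parse_accept("fr; q=1.0, en; q=0.5")
media = parse_accept(
    "text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, "
    "image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1"
)
print(languages)
print(media)
```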

      httpd supports ‘server driven’ content negotiation, as defined in the HTTP/1.1 specification. It fully supports the Accept , Accept-Language , Accept-Charset and Accept-Encoding request headers. httpd also supports ‘transparent’ content negotiation, which is an experimental negotiation protocol defined in RFC 2295 and RFC 2296. It does not offer support for ‘feature negotiation’ as defined in these RFCs.

      A resource is a conceptual entity identified by a URI (RFC 2396). An HTTP server like Apache HTTP Server provides access to representations of the resource(s) within its namespace, with each representation in the form of a sequence of bytes with a defined media type, character set, encoding, etc. Each resource may be associated with zero, one, or more than one representation at any given time. If multiple representations are available, the resource is referred to as negotiable and each of its representations is termed a variant. The ways in which the variants for a negotiable resource vary are called the dimensions of negotiation.

      Negotiation in httpd

      In order to negotiate a resource, the server needs to be given information about each of the variants. This is done in one of two ways:

      • Using a type map (i.e., a *.var file) which names the files containing the variants explicitly, or
      • Using a ‘MultiViews’ search, where the server does an implicit filename pattern match and chooses from among the results.

      Using a type-map file

      A type map is a document which is associated with the handler named type-map (or, for backwards-compatibility with older httpd configurations, the MIME-type application/x-type-map). Note that to use this feature, you must have a handler set in the configuration that defines a file suffix as type-map; this is best done with

      AddHandler type-map .var

      in the server configuration file.

      Type map files should have the same name as the resource which they are describing, followed by the extension .var . In the examples shown below, the resource is named foo , so the type map file is named foo.var .

      This file should have an entry for each available variant; these entries consist of contiguous HTTP-format header lines. Entries for different variants are separated by blank lines. Blank lines are illegal within an entry. It is conventional to begin a map file with an entry for the combined entity as a whole (although this is not required, and if present will be ignored). An example map file is shown below.

      URIs in this file are relative to the location of the type map file. Usually, these files will be located in the same directory as the type map file, but this is not required. You may provide absolute or relative URIs for any file located on the same server as the map file.

      URI: foo.en.html
      Content-type: text/html
      Content-language: en


      URI: foo.fr.de.html
      Content-type: text/html;charset=iso-8859-2
      Content-language: fr, de
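
The entry structure of the map file above (header lines grouped into entries, entries separated by blank lines) can be read with a few lines of Python. This is a sketch of the file format only, not of how httpd parses it; `parse_type_map` is a hypothetical name.

```python
TYPE_MAP = """\
URI: foo.en.html
Content-type: text/html
Content-language: en


URI: foo.fr.de.html
Content-type: text/html;charset=iso-8859-2
Content-language: fr, de
"""

def parse_type_map(text):
    """Split a type map into entries; each entry maps lower-cased header names to values."""
    entries, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # One or more blank lines end the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        entries.append(current)
    return entries

entries = parse_type_map(TYPE_MAP)
for entry in entries:
    print(entry["uri"], entry["content-language"])
```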

      Note also that a typemap file will take precedence over the filename’s extension, even when Multiviews is on. If the variants have different source qualities, that may be indicated by the «qs» parameter to the media type, as in this picture (available as JPEG, GIF, or ASCII-art):

      URI: foo.jpeg
      Content-type: image/jpeg; qs=0.8

      URI: foo.gif
      Content-type: image/gif; qs=0.5

      URI: foo.txt
      Content-type: text/plain; qs=0.01

      qs values can vary in the range 0.000 to 1.000. Note that any variant with a qs value of 0.000 will never be chosen. Variants with no ‘qs’ parameter value are given a qs factor of 1.0. The qs parameter indicates the relative ‘quality’ of this variant compared to the other available variants, independent of the client’s capabilities. For example, a JPEG file is usually of higher source quality than an ASCII file if it is attempting to represent a photograph. However, if the resource being represented is an original ASCII art, then an ASCII representation would have a higher source quality than a JPEG representation. A qs value is therefore specific to a given variant depending on the nature of the resource it represents.
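
As a worked example of how qs interacts with the client's own preferences, the sketch below multiplies each variant's qs from the picture example by a client q value and picks the highest product. The client preferences are hypothetical, and only exact media type matches are considered (no wildcards).

```python
# Variants and qs values from the picture example above.
variants = {
    "foo.jpeg": ("image/jpeg", 0.8),
    "foo.gif": ("image/gif", 0.5),
    "foo.txt": ("text/plain", 0.01),
}

# Hypothetical client: Accept: image/gif, image/jpeg; q=0.5, text/plain; q=0.9
client_q = {"image/gif": 1.0, "image/jpeg": 0.5, "text/plain": 0.9}

# Overall quality is the client's q multiplied by the source quality qs.
scores = {
    name: client_q.get(media_type, 0.0) * qs
    for name, (media_type, qs) in variants.items()
}
best = max(scores, key=scores.get)
print(scores)
print(best)
```

Even though the JPEG has the highest source quality, this client's lower q for image/jpeg (0.5 x 0.8 = 0.4) puts it behind the GIF (1.0 x 0.5 = 0.5).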

      The full list of headers recognized is available in the mod_negotiation typemap documentation.

      Multiviews

      MultiViews is a per-directory option, meaning it can be set with an Options directive within a <Directory>, <Location> or <Files> section in apache2.conf, or (if AllowOverride is properly set) in .htaccess files. Note that Options All does not set MultiViews; you have to ask for it by name.

      The effect of MultiViews is as follows: if the server receives a request for /some/dir/foo , if /some/dir has MultiViews enabled, and /some/dir/foo does not exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client’s requirements.

      MultiViews may also apply to searches for the file named by the DirectoryIndex directive, if the server is trying to index a directory. If the configuration files specify

      DirectoryIndex index

      then the server will arbitrate between index.html and index.html3 if both are present. If neither are present, and index.cgi is there, the server will run it.

      If one of the files found when reading the directory does not have an extension recognized by mod_mime to designate its Charset, Content-Type, Language, or Encoding, then the result depends on the setting of the MultiViewsMatch directive. This directive determines whether handlers, filters, and other extension types can participate in MultiViews negotiation.

      The Negotiation Methods

      After httpd has obtained a list of the variants for a given resource, either from a type-map file or from the filenames in the directory, it invokes one of two methods to decide on the ‘best’ variant to return, if any. It is not necessary to know any of the details of how negotiation actually takes place in order to use httpd’s content negotiation features. However the rest of this document explains the methods used for those interested.

      There are two negotiation methods:

      1. Server driven negotiation with the httpd algorithm is used in the normal case. The httpd algorithm is explained in more detail below. When this algorithm is used, httpd can sometimes ‘fiddle’ the quality factor of a particular dimension to achieve a better result. The ways httpd can fiddle quality factors are explained in more detail below.
      2. Transparent content negotiation is used when the browser specifically requests this through the mechanism defined in RFC 2295. This negotiation method gives the browser full control over deciding on the ‘best’ variant; the result is therefore dependent on the specific algorithms used by the browser. As part of the transparent negotiation process, the browser can ask httpd to run the ‘remote variant selection algorithm’ defined in RFC 2296.

      Dimensions of Negotiation

      Dimension | Notes
      Media Type | Browser indicates preferences with the Accept header field. Each item can have an associated quality factor. Variant description can also have a quality factor (the «qs» parameter).
      Language | Browser indicates preferences with the Accept-Language header field. Each item can have a quality factor. Variants can be associated with none, one or more than one language.
      Encoding | Browser indicates preference with the Accept-Encoding header field. Each item can have a quality factor.
      Charset | Browser indicates preference with the Accept-Charset header field. Each item can have a quality factor. Variants can indicate a charset as a parameter of the media type.

      httpd Negotiation Algorithm

      httpd can use the following algorithm to select the ‘best’ variant (if any) to return to the browser. This algorithm is not further configurable. It operates as follows:

      1. First, for each dimension of the negotiation, check the appropriate Accept* header field and assign a quality to each variant. If the Accept* header for any dimension implies that this variant is not acceptable, eliminate it. If no variants remain, go to step 4.
      2. Select the ‘best’ variant by a process of elimination. Each of the following tests is applied in order. Any variants not selected at each test are eliminated. After each test, if only one variant remains, select it as the best match and proceed to step 3. If more than one variant remains, move on to the next test.
        1. Multiply the quality factor from the Accept header with the quality-of-source factor for this variant’s media type, and select the variants with the highest value.
        2. Select the variants with the highest language quality factor.
        3. Select the variants with the best language match, using either the order of languages in the Accept-Language header (if present), or else the order of languages in the LanguagePriority directive (if present).
        4. Select the variants with the highest ‘level’ media parameter (used to give the version of text/html media types).
        5. Select variants with the best charset media parameters, as given on the Accept-Charset header line. Charset ISO-8859-1 is acceptable unless explicitly excluded. Variants with a text/* media type but not explicitly associated with a particular charset are assumed to be in ISO-8859-1.
        6. Select those variants which have associated charset media parameters that are not ISO-8859-1. If there are no such variants, select all variants instead.
        7. Select the variants with the best encoding. If there are variants with an encoding that is acceptable to the user-agent, select only these variants. Otherwise if there is a mix of encoded and non-encoded variants, select only the unencoded variants. If either all variants are encoded or all variants are not encoded, select all variants.
        8. Select the variants with the smallest content length.
        9. Select the first variant of those remaining. This will be either the first listed in the type-map file, or when variants are read from the directory, the one whose file name comes first when sorted using ASCII code order.
      3. The algorithm has now selected one ‘best’ variant, so return it as the response. The HTTP response header Vary is set to indicate the dimensions of negotiation (browsers and caches can use this information when caching the resource). End.
      4. To get here means no variant was selected (because none are acceptable to the browser). Return a 406 status (meaning «No acceptable representation») with a response body consisting of an HTML document listing the available variants. Also set the HTTP Vary header to indicate the dimensions of variance.
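
      The elimination process above can be sketched in a much reduced form. The following is an illustration under stated assumptions, not httpd’s implementation: it models only the media type dimension, covers step 1, tests 1, 8 and 9 of step 2, and the two outcomes (a selected variant or a 406), and all class and field names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Simplified, hypothetical model of the selection steps; not httpd's code.
final class VariantSelectionSketch {

    static final class Variant {
        final String name;
        final String mediaType; // assumed well-formed, e.g. "text/html"
        final double qs;        // quality-of-source factor
        final long length;      // content length in bytes

        Variant(String name, String mediaType, double qs, long length) {
            this.name = name;
            this.mediaType = mediaType;
            this.qs = qs;
            this.length = length;
        }
    }

    // Quality the client gives a media type: exact entry first, then "type/*",
    // then "*/*". A missing entry counts as 0.0, meaning "not acceptable".
    static double clientQuality(Map<String, Double> accept, String mediaType) {
        String typeWildcard = mediaType.substring(0, mediaType.indexOf('/')) + "/*";
        for (String key : List.of(mediaType, typeWildcard, "*/*")) {
            Double q = accept.get(key);
            if (q != null) {
                return q;
            }
        }
        return 0.0;
    }

    static Optional<Variant> select(Map<String, Double> accept, List<Variant> variants) {
        Variant best = null;
        double bestScore = 0.0;
        for (Variant v : variants) {
            double score = clientQuality(accept, v.mediaType) * v.qs;
            if (score <= 0.0) {
                continue; // step 1: unacceptable variants are eliminated
            }
            boolean better = best == null
                    || score > bestScore                                // test 1: highest q * qs
                    || (score == bestScore && v.length < best.length); // test 8: smallest length
            if (better) {
                best = v;
                bestScore = score;
            }
            // On a complete tie the earlier variant stays selected (test 9).
        }
        return Optional.ofNullable(best); // empty corresponds to step 4 (406)
    }

    public static void main(String[] args) {
        Map<String, Double> accept = Map.of("text/html", 1.0, "*/*", 0.01);
        List<Variant> variants = List.of(
                new Variant("foo.html", "text/html", 0.7, 1200),
                new Variant("foo.pdf", "application/pdf", 1.0, 900));
        System.out.println(select(accept, variants).map(v -> v.name).orElse("406"));
    }
}
```

      In the sample run the HTML variant scores 1.0 × 0.7 = 0.7 and the PDF variant 0.01 × 1.0 = 0.01, so foo.html is selected even though its quality-of-source factor is lower.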

      Fiddling with Quality Values

      httpd sometimes changes the quality values from what would be expected by a strict interpretation of the httpd negotiation algorithm above. This is to get a better result from the algorithm for browsers which do not send full or accurate information. Some of the most popular browsers send Accept header information which would otherwise result in the selection of the wrong variant in many cases. If a browser sends full and correct information these fiddles will not be applied.

      Media Types and Wildcards

      The Accept: request header indicates preferences for media types. It can also include ‘wildcard’ media types, such as «image/*» or «*/*» where the * matches any string. So a request including:

      Accept: image/*, */*

      would indicate that any type starting «image/» is acceptable, as is any other type. Some browsers routinely send wildcards in addition to explicit types they can handle. For example:

      Accept: text/html, text/plain, image/gif, image/jpeg, */*

      The intention of this is to indicate that the explicitly listed types are preferred, but if a different representation is available, that is ok too. Using explicit quality values, what the browser really wants is something like:

      Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01

      The explicit types have no quality factor, so they default to a preference of 1.0 (the highest). The wildcard */* is given a low preference of 0.01, so other types will only be returned if no variant matches an explicitly listed type.

      If the Accept: header contains no q factors at all, httpd sets the q value of «*/*», if present, to 0.01 to emulate the desired behavior. It also sets the q value of wildcards of the format «type/*» to 0.02 (so these are preferred over matches against «*/*»). If any media type on the Accept: header contains a q factor, these special values are not applied, so requests from browsers which send the explicit information to start with work as expected.
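
      The wildcard adjustment just described can be sketched as follows. This is an illustrative model rather than httpd’s code: the class and method names are hypothetical, and only the 0.01 and 0.02 values come from the text above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the wildcard adjustment; not httpd's implementation.
final class WildcardFiddleSketch {

    static Map<String, Double> effectiveQualities(String acceptHeader) {
        Map<String, Double> result = new LinkedHashMap<>();
        boolean anyExplicitQ = false;
        for (String item : acceptHeader.split(",")) {
            String[] parts = item.trim().split(";");
            if (parts.length == 0 || parts[0].trim().isEmpty()) {
                continue; // skip empty list elements
            }
            double q = 1.0;
            for (int i = 1; i < parts.length; i++) {
                String param = parts[i].trim().toLowerCase();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2).trim());
                    anyExplicitQ = true;
                }
            }
            result.put(parts[0].trim().toLowerCase(), q);
        }
        if (!anyExplicitQ) {
            // No q factor anywhere in the header: lower the wildcards.
            for (Map.Entry<String, Double> e : result.entrySet()) {
                if (e.getKey().equals("*/*")) {
                    e.setValue(0.01);
                } else if (e.getKey().endsWith("/*")) {
                    e.setValue(0.02);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(effectiveQualities("text/html, image/*, */*"));
        System.out.println(effectiveQualities("text/html, */*;q=0.5"));
    }
}
```

      The second call leaves every value as sent, because a single explicit q factor anywhere in the header switches the adjustment off.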

      Language Negotiation Exceptions

      New in httpd 2.0, some exceptions have been added to the negotiation algorithm to allow graceful fallback when language negotiation fails to find a match.

      When a client requests a page on your server, but the server cannot find a single page that matches the Accept-Language sent by the browser, the server will return either a «No Acceptable Variant» or «Multiple Choices» response to the client. To avoid these error messages, it is possible to configure httpd to ignore the Accept-Language in these cases and provide a document that does not explicitly match the client’s request. The ForceLanguagePriority directive can be used to override one or both of these error messages and substitute the server’s judgement based on the LanguagePriority directive.

      The server will also attempt to match language-subsets when no other match can be found. For example, if a client requests documents with the language en-GB for British English, the server is not normally allowed by the HTTP/1.1 standard to match that against a document that is marked as simply en . (Note that it is almost surely a configuration error to include en-GB and not en in the Accept-Language header, since it is very unlikely that a reader understands British English, but doesn’t understand English in general. Unfortunately, many current clients have default configurations that resemble this.) However, if no other language match is possible and the server is about to return a «No Acceptable Variants» error or fall back to the LanguagePriority , the server will ignore the subset specification and match en-GB against en documents. Implicitly, httpd will add the parent language to the client’s acceptable language list with a very low quality value. But note that if the client requests «en-GB; q=0.9, fr; q=0.8», and the server has documents designated «en» and «fr», then the «fr» document will be returned. This is necessary to maintain compliance with the HTTP/1.1 specification and to work effectively with properly configured clients.
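
      A minimal sketch of this parent-language fallback follows. It is a model under stated assumptions, not httpd’s code: the names are hypothetical, the value 0.001 is only a stand-in for the «very low quality value» (the text gives no number), and the «*» language range is not handled.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the parent-language fallback; not httpd's code.
final class LanguageFallbackSketch {

    // Assumed stand-in for the "very low quality value"; the text gives no number.
    static final double PARENT_QUALITY = 0.001;

    static Optional<String> select(Map<String, Double> acceptLanguage, List<String> available) {
        Map<String, Double> ranges = new HashMap<>();
        for (Map.Entry<String, Double> e : acceptLanguage.entrySet()) {
            ranges.put(e.getKey().toLowerCase(), e.getValue());
        }
        // Fallback: add the parent of each "xx-YY" range unless the client listed it.
        for (String range : acceptLanguage.keySet()) {
            int dash = range.indexOf('-');
            if (dash > 0) {
                ranges.putIfAbsent(range.substring(0, dash).toLowerCase(), PARENT_QUALITY);
            }
        }
        String best = null;
        double bestQ = 0.0;
        for (String language : available) {
            double q = quality(ranges, language.toLowerCase());
            if (q > bestQ) {
                best = language;
                bestQ = q;
            }
        }
        return Optional.ofNullable(best); // empty: no acceptable language variant
    }

    // A range matches a tag when they are equal or the tag starts with range + "-";
    // the longest (most specific) matching range supplies the quality.
    static double quality(Map<String, Double> ranges, String tag) {
        double q = 0.0;
        int matchedLength = -1;
        for (Map.Entry<String, Double> e : ranges.entrySet()) {
            String range = e.getKey();
            boolean matches = tag.equals(range) || tag.startsWith(range + "-");
            if (matches && range.length() > matchedLength) {
                matchedLength = range.length();
                q = e.getValue();
            }
        }
        return q;
    }

    public static void main(String[] args) {
        Map<String, Double> accept = Map.of("en-GB", 0.9, "fr", 0.8);
        System.out.println(select(accept, List.of("en", "fr")).orElse("none"));
        System.out.println(select(accept, List.of("en", "de")).orElse("none"));
    }
}
```

      With documents «en» and «fr» the sketch picks «fr», matching the example in the text, because the implied «en» entry only carries the low fallback quality. With documents «en» and «de» it picks «en» instead of reporting no match.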

      In order to support advanced techniques (such as cookies or special URL-paths) to determine the user’s preferred language, since httpd 2.0.47 mod_negotiation recognizes the environment variable prefer-language . If it exists and contains an appropriate language tag, mod_negotiation will try to select a matching variant. If there’s no such variant, the normal negotiation process applies.

      Example

      Extensions to Transparent Content Negotiation

      httpd extends the transparent content negotiation protocol (RFC 2295) as follows. A new {encoding ..} element is used in variant lists to label variants which are available with a specific content-encoding only. The implementation of the RVSA/1.0 algorithm (RFC 2296) is extended to recognize encoded variants in the list, and to use them as candidate variants whenever their encodings are acceptable according to the Accept-Encoding request header. The RVSA/1.0 implementation does not round computed quality factors to 5 decimal places before choosing the best variant.

      If you are using language negotiation you can choose between different naming conventions, because files can have more than one extension, and the order of the extensions is normally irrelevant (see the mod_mime documentation for details).

      A typical file has a MIME-type extension (e.g., html ), maybe an encoding extension (e.g., gz ), and of course a language extension (e.g., en ) when we have different language variants of this file.

      Here are some more examples of filenames, together with valid and invalid hyperlinks:

      Filename | Valid hyperlink | Invalid hyperlink
      foo.html.en | foo, foo.html | -
      foo.en.html | foo | foo.html
      foo.html.en.gz | foo, foo.html | foo.gz, foo.html.gz
      foo.en.html.gz | foo | foo.html, foo.html.gz, foo.gz
      foo.gz.html.en | foo, foo.gz, foo.gz.html | foo.html
      foo.html.gz.en | foo, foo.html, foo.html.gz | foo.gz

      Looking at the table above, you will notice that it is always possible to use the name without any extensions in a hyperlink (e.g., foo ). The advantage is that you can hide the actual type of a document or file and can change it later, e.g., from html to shtml or cgi without changing any hyperlink references.

      If you want to continue to use a MIME-type in your hyperlinks (e.g. foo.html ) the language extension (including an encoding extension if there is one) must be on the right hand side of the MIME-type extension (e.g., foo.html.en ).
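
      Every row of the table is consistent with a simple prefix rule: a hyperlink works when it is the file name itself or the file name cut off at one of its dots. The sketch below encodes only that inferred rule. It is an assumption drawn from the table, not a description of how httpd actually matches files, and the names are hypothetical.

```java
// Hypothetical check inferred from the table above; httpd's real lookup
// involves more than this prefix rule.
final class HyperlinkNameSketch {

    // True when the link is the whole file name or a prefix ending at a dot.
    static boolean linkMatches(String filename, String link) {
        return filename.equals(link) || filename.startsWith(link + ".");
    }

    public static void main(String[] args) {
        System.out.println(linkMatches("foo.html.en", "foo.html"));    // valid in the table
        System.out.println(linkMatches("foo.en.html", "foo.html"));    // invalid in the table
        System.out.println(linkMatches("foo.gz.html.en", "foo.gz"));   // valid in the table
        System.out.println(linkMatches("foo.html.en.gz", "foo.html.gz")); // invalid in the table
    }
}
```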

      Note on Caching

      When a cache stores a representation, it associates it with the request URL. The next time that URL is requested, the cache can use the stored representation. But, if the resource is negotiable at the server, this might result in only the first requested variant being cached and subsequent cache hits might return the wrong response. To prevent this, httpd normally marks all responses that are returned after content negotiation as non-cacheable by HTTP/1.0 clients. httpd also supports the HTTP/1.1 protocol features to allow caching of negotiated responses.

      For requests which come from an HTTP/1.0-compliant client (either a browser or a cache), the directive CacheNegotiatedDocs can be used to allow caching of responses which were subject to negotiation. This directive can be given in the server config or virtual host, and takes no arguments. It has no effect on requests from HTTP/1.1 clients.

      For HTTP/1.1 clients, httpd sends a Vary HTTP response header to indicate the negotiation dimensions for the response. Caches can use this information to determine whether a subsequent request can be served from the local copy. To encourage a cache to use the local copy regardless of the negotiation dimensions, set the force-no-vary environment variable.
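
      One way a cache can use the Vary header is to extend its lookup key with the request header values that Vary names, so that each negotiated variant is stored separately. The following is a rough sketch of that idea with hypothetical names and key format. It does not handle «Vary: *» and is not how any particular cache is implemented.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical cache-key builder; real caches differ in detail.
final class VaryCacheKeySketch {

    // Builds a key from the URL plus the request header values named in Vary.
    static String cacheKey(String url, String varyHeader, Map<String, String> requestHeaders) {
        // Header names are case-insensitive, so look them up that way.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(requestHeaders);
        StringBuilder key = new StringBuilder(url);
        for (String name : varyHeader.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            key.append('|').append(trimmed.toLowerCase())
               .append('=').append(headers.getOrDefault(trimmed, ""));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        String vary = "negotiate,accept-language";
        System.out.println(cacheKey("/foo", vary, Map.of("Accept-Language", "en")));
        System.out.println(cacheKey("/foo", vary, Map.of("Accept-Language", "fr")));
    }
}
```

      The two requests produce different keys, so a stored English response is not returned to the client asking for French.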

      Copyright 2020 The Apache Software Foundation.
      Licensed under the Apache License, Version 2.0.

      Content Negotiation using Spring MVC

      There are two ways to generate output using Spring MVC:

      • You can use the RESTful @ResponseBody approach and HTTP message converters, typically to return data-formats like JSON or XML. Programmatic clients, mobile apps and AJAX enabled browsers are the usual clients.
      • Alternatively you may use view resolution. Although views are perfectly capable of generating JSON and XML if you wish (more on that in my next post), views are normally used to generate presentation formats like HTML for a traditional web-application.
      • Actually there is a third possibility — some applications require both, and Spring MVC supports such combinations easily. We will come back to that right at the end.

      In either case you’ll need to deal with multiple representations (or views) of the same data returned by the controller. Working out which data format to return is called Content Negotiation.

      There are three situations where we need to know what type of data-format to send in the HTTP response:

      • HttpMessageConverters: Determine the right converter to use.
      • Request Mappings: Map an incoming HTTP request to different methods that return different formats.
      • View Resolution: Pick the right view to use.

      Determining what format the user has requested relies on a ContentNegotiationStrategy . There are default implementations available out of the box, but you can also implement your own if you wish.

      In this post I want to discuss how to configure and use content negotiation with Spring, mostly in terms of RESTful Controllers using HTTP message converters. In a later post I will show how to set up content negotiation specifically for use with views using Spring’s ContentNegotiatingViewResolver .

      How does Content Negotiation Work?

      Over HTTP, a client can state the response format it wants in the Accept request header, but relying on that header is not always practical. For those situations where the Accept header is not desirable, Spring offers some conventions to use instead. (This was one of the nice changes in Spring 3.2, making a flexible content selection strategy available across all of Spring MVC, not just when using views.) You can configure a content negotiation strategy centrally once and it will apply wherever different formats (media types) need to be determined.

      Enabling Content Negotiation in Spring MVC

      Spring supports a couple of conventions for selecting the format required: URL suffixes and/or a URL parameter. These work alongside the use of Accept headers. As a result, the content-type can be requested in any of three ways. By default they are checked in this order:

      • Add a path extension (suffix) in the URL. So, if the incoming URL is something like http://myserver/myapp/accounts/list.html then HTML is required. For a spreadsheet the URL should be http://myserver/myapp/accounts/list.xls . The suffix to media-type mapping is automatically defined via the JavaBeans Activation Framework or JAF (so activation.jar must be on the class path).
      • A URL parameter like this: http://myserver/myapp/accounts/list?format=xls . The name of the parameter is format by default, but this may be changed. Using a parameter is disabled by default, but when enabled, it is checked second.
      • Finally the Accept HTTP header property is checked. This is how HTTP is actually defined to work, but, as previously mentioned, it can be problematic to use.
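
      The checking order in the list above can be modelled in plain Java. This is a standalone illustration with hypothetical names and a hand-written mapping table. It does not use Spring’s classes, it falls through to the next check when an extension or parameter value is unknown (an assumption of this sketch), and it reads the Accept header naively by taking the first listed type.

```java
import java.util.Map;
import java.util.Optional;

// Standalone model of the path extension, parameter, Accept header order.
// Hypothetical code; it does not use or reproduce Spring's classes.
final class PpaStrategySketch {

    // Assumed mapping from extension or parameter value to media type.
    static final Map<String, String> MEDIA_TYPES = Map.of(
            "html", "text/html",
            "json", "application/json",
            "xml", "application/xml",
            "xls", "application/vnd.ms-excel");

    static Optional<String> resolve(String path, String formatParam,
                                    String acceptHeader, boolean favorParameter) {
        // 1. Path extension: the text after the last dot of the last path segment.
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot > slash) {
            String type = MEDIA_TYPES.get(path.substring(dot + 1).toLowerCase());
            if (type != null) {
                return Optional.of(type);
            }
        }
        // 2. URL parameter, only when enabled.
        if (favorParameter && formatParam != null) {
            String type = MEDIA_TYPES.get(formatParam.toLowerCase());
            if (type != null) {
                return Optional.of(type);
            }
        }
        // 3. Accept header: simplification that takes the first listed type
        //    and ignores quality factors.
        if (acceptHeader != null) {
            String first = acceptHeader.split("[,;]", 2)[0].trim();
            if (!first.isEmpty()) {
                return Optional.of(first);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String accept = "text/html";
        System.out.println(resolve("/myapp/accounts/list.xls", null, accept, false));
        System.out.println(resolve("/myapp/accounts/list", "xls", accept, false));
        System.out.println(resolve("/myapp/accounts/list", "xls", accept, true));
    }
}
```

      The second and third calls differ only in whether the parameter check is enabled, which is why the same URL resolves to HTML in one case and to a spreadsheet in the other.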

      The Java Configuration to set this up, looks like this. Simply customize the predefined content negotiation manager via its configurer. Note the MediaType helper class has predefined constants for most well-known media-types.

      When using XML configuration, the content negotiation strategy is most easily set up via the ContentNegotiationManagerFactoryBean :

      The ContentNegotiationManager created by either setup is an implementation of ContentNegotiationStrategy that implements the PPA Strategy (path extension, then parameter, then Accept header) described above.

      Additional Configuration Options

      In Java configuration, the strategy can be fully customized using methods on the configurer:

      In XML, the strategy can be configured using methods on the factory bean:

      What we did, in both cases:

      • Disabled path extension. Note that favor does not mean use one approach in preference to another, it just enables or disables it. The order of checking is always path extension, parameter, Accept header.
      • Enabled the use of the URL parameter, but instead of the default parameter name, format , we use mediaType .
      • Ignored the Accept header completely. This is often the best approach if most of your clients are actually web-browsers (typically making REST calls via AJAX).
      • Stopped using the JAF and instead specified the media type mappings manually, because we only wish to support JSON and XML.

      Listing User Accounts Example

      To return a list of accounts in JSON or XML, I need a Controller like this. We will ignore the HTML generating methods for now.

      Here is the content-negotiation strategy setup:

      Or, using Java Configuration, the code looks like this:

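      Provided the JAXB2 and Jackson libraries are on the class path, Spring MVC will automatically set up the necessary HttpMessageConverters . My domain classes must also be annotated for JAXB2 and Jackson so that the converters can marshal them. The annotated Account class is shown below.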

      Here is the JSON output from our Accounts application (note path-extension in URL).

      How does the system know whether to convert to XML or JSON? Because of content negotiation — any one of the three (PPA Strategy) options discussed above will be used depending on how the ContentNegotiationManager is configured. In this case the URL ends in accounts.json because the path-extension is the only strategy enabled.

      In the sample code you can switch between XML or Java Configuration of MVC by setting an active profile in the web.xml . The profiles are “xml” and “javaconfig” respectively.

      Combining Data and Presentation Formats

      Spring MVC’s REST support builds on the existing MVC Controller framework. So it is possible to have the same web-applications return information both as raw data (like JSON) and using a presentation format (like HTML).

      Both techniques can easily be used side by side in the same controller, like this:

      There is a simple pattern here: the @ResponseBody method handles all data access and integration with the underlying service layer (the AccountManager ). The second method calls the first and sets up the response in the Model for use by a View. This avoids duplicated logic.

      To determine which of the two @RequestMapping methods to pick, we are again using our PPA content negotiation strategy. It allows the produces option to work. URLs ending with accounts.xml or accounts.json map to the first method, any other URLs ending in accounts.anything map to the second.
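
      The effect of the produces option can be pictured with a small standalone model. It is only an illustration: the handler names are hypothetical, Spring’s real mapping logic is considerably richer, and the sketch assumes the requested media type has already been resolved by the negotiation strategy.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone model of choosing between two handlers by declared media types.
// Hypothetical names; this is not Spring's request mapping code.
final class ProducesMappingSketch {

    // Handler name -> media types it declares; an empty list means no restriction.
    static final Map<String, List<String>> HANDLERS = new LinkedHashMap<>();
    static {
        HANDLERS.put("listWithMarshalling",
                Arrays.asList("application/json", "application/xml"));
        HANDLERS.put("listWithView", Collections.<String>emptyList());
    }

    static String chooseHandler(String requestedMediaType) {
        String fallback = null;
        for (Map.Entry<String, List<String>> e : HANDLERS.entrySet()) {
            if (e.getValue().contains(requestedMediaType)) {
                return e.getKey(); // a handler that declares this type wins
            }
            if (e.getValue().isEmpty() && fallback == null) {
                fallback = e.getKey(); // unrestricted handler takes everything else
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(chooseHandler("application/json"));
        System.out.println(chooseHandler("text/html"));
    }
}
```

      JSON and XML requests reach the data method, and any other requested type falls through to the view-based method, which mirrors the URL behaviour described above.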

      Another Approach

      Alternatively we could do the whole thing with just one method if we used views to generate all possible content-types. This is where the ContentNegotiatingViewResolver comes in and that will be the subject of my next post.

      Acknowledgements

      I would like to thank Rossen Stoyanchev for his help in writing this post. Any errors are my own.

      Addendum: The Annotated Account Class

      Added 2 June 2013.

      Since there were some questions on how to annotate a class for JAXB, here is part of the Account class. For brevity I have omitted the data-members, and all methods except the annotated getters. I could annotate the data-members directly if preferred (just like JPA annotations in fact). Remember that Jackson can marshal objects to JSON using these same annotations.
