Note
Notes highlighted in yellow are comments on the proposed wording and are not intended for the actual TS.
The scope of this Technical Specification will include a single std::experimental::uri type, specifications about how URIs are intended to be processed and extended, including some additional helper types and functions. It will include a std::experimental::uri_builder type to build a URI from its components. It will also include a std::experimental::uri_errc enum which enumerates the types of error that can occur when parsing, normalizing, resolving or building a URI, or percent decoding. Finally, it will include types and functions for percent encoding and decoding, URI references, reference resolution and URI normalization and comparison.
The generic syntax of a URI is defined in IETF RFC 3986, section 3.
All URIs are of the form:
scheme ":" "hierarchical part" [ "?" query ] [ "#" fragment ]
The scheme is used to identify the specification needed to parse the rest of the URI. A generic syntax parser can parse any URI into its main parts. The scheme can then be used to identify whether further scheme-specific parsing can be performed.
The hierarchical part refers to the part of the URI that holds identification information that is hierarchical in nature. This may contain an authority (always prefixed with a double slash character (“//”)) and/or a path. The path part is required, though it may be empty. The authority part holds an optional user info part, ending with an at character (“@”); a host identifier; and an optional port number, preceded by a colon character (“:”). The host may be an IP address or domain name. The normative reference for IPv6 addresses is IETF RFC 3986.
The query is an optional part following a question mark character (“?”) that contains information that is not hierarchical.
Finally, the fragment is an optional part, prefixed by a hash character (“#”), that is used to identify secondary resources.
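As an illustration of the generic syntax described above, the following sketch splits a URI reference into its generic parts using the regular expression given in IETF RFC 3986, Appendix B. The names uri_parts and split_uri are hypothetical and are not part of this proposal, and the sketch performs no validation.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical holder for the generic parts; not a type from this proposal.
struct uri_parts {
  bool has_scheme = false;
  bool has_authority = false;
  bool has_query = false;
  bool has_fragment = false;
  std::string scheme, authority, path, query, fragment;
};

// Splits a URI reference using the regular expression from RFC 3986,
// Appendix B. The expression accepts almost any input, so this is a
// decomposition only, not a validating parser.
inline uri_parts split_uri(const std::string& input) {
  static const std::regex pattern(
      R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
  uri_parts parts;
  std::smatch m;
  if (!std::regex_match(input, m, pattern)) {
    return parts;  // for example, input containing a line break
  }
  parts.has_scheme = m[1].matched;
  parts.scheme = m[2].str();
  parts.has_authority = m[3].matched;
  parts.authority = m[4].str();
  parts.path = m[5].str();
  parts.has_query = m[6].matched;
  parts.query = m[7].str();
  parts.has_fragment = m[8].matched;
  parts.fragment = m[9].str();
  return parts;
}
```

The has_ flags distinguish a part that is absent from one that is present but empty, which is the same distinction the optional-returning accessors of the proposed uri class are meant to express.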
IETF RFC 3987 specifies a new protocol element, the Internationalized Resource Identifier (IRI). The IRI complements a URI, and extends it to allow Unicode characters. The syntax of an IRI is specified in IETF RFC 3987, section 2.
IETF RFC 6874 specifies scoped IDs in IPv6 addresses. The syntax is specified in IETF RFC 6874, section 2.
The rules for URI normalization are specified in IETF RFC 3986, section 6 and IETF RFC 3987, section 5.
The rule for transforming references is given in IETF RFC 3986, section 5.2.2.
The rule for removing dot segments is given in IETF RFC 3986, section 5.2.4.
The rule for recomposing a URI from its parts is given in IETF RFC 3986, section 5.3.
A Uniform Resource Identifier is a sequence of characters from a limited set with a specific syntax used to identify a name or resource. URIs can be classified as URLs or URNs. The URI syntax is defined in IETF RFC 3986.
A Uniform Resource Locator (URL) is a type of URI, complementary to a URN, used to locate a resource over a network.
A Uniform Resource Name (URN) is a type of URI, complementary to a URL, used to unambiguously identify resources.
An Internationalized Resource Identifier (IRI) is a complement to the URI that allows characters from the Universal Character Set (Unicode/ISO 10646). The IRI syntax is defined in IETF RFC 3987.
A generic URI is decomposed into four principal parts: the scheme, the hierarchical part, an optional query and optional fragment. The hierarchical part can be further decomposed into four parts: the user info, host, port and path.
A scheme name is the top level of the URI naming structure. It indicates the specifications, syntax and semantics of the rest of the URI structure. It is always followed by a colon character (“:”). The scheme syntax is defined in IETF RFC 3986, section 3.1.
A query is a part, indicated by a question mark character (“?”) and terminated by a hash character (“#”) or by the end of the URI, that contains non-hierarchical information. It is commonly structured as a sequence of key-value pairs, with each key separated from its value by an equals character (“=”) and the pairs separated from one another by a semicolon character (“;”) or an ampersand character (“&”). The query syntax is defined in IETF RFC 3986, section 3.4.
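The common key-value structure of a query can be sketched as follows. The function name split_query is hypothetical; this proposal does not define a query parser, and the sketch does not percent-decode keys or values.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Splits a query (without the leading "?") into key-value pairs.
// Pairs are separated by "&" or ";"; a key is separated from its value by
// the first "=". A pair without "=" yields an empty value.
inline std::vector<std::pair<std::string, std::string>>
split_query(const std::string& query) {
  std::vector<std::pair<std::string, std::string>> pairs;
  std::size_t start = 0;
  while (start <= query.size()) {
    std::size_t end = query.find_first_of("&;", start);
    if (end == std::string::npos) {
      end = query.size();
    }
    const std::string item = query.substr(start, end - start);
    if (!item.empty()) {  // skip empty items such as "a=1&&b=2"
      const std::size_t eq = item.find('=');
      if (eq == std::string::npos) {
        pairs.emplace_back(item, std::string());
      } else {
        pairs.emplace_back(item.substr(0, eq), item.substr(eq + 1));
      }
    }
    start = end + 1;
  }
  return pairs;
}
```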
A fragment is indicated by a hash character (“#”) and allows indirect identification of a secondary resource. For example, a fragment may refer to a section header in an HTML document with an id attribute of the same name. The fragment syntax is defined in IETF RFC 3986, section 3.5.
The hierarchical part of a URI contains hierarchical information. If it starts with two consecutive forward slash characters (“//”), it is followed by an authority and a path. The authority can be further broken down into a user-information part, a hostname and a port. The authority is followed by an optional path. If the hierarchical part does not begin with two consecutive forward slash characters (“//”), then it contains a path.
The hierarchical part may contain an authority. The authority contains an optional user info followed by an at character (“@”), a host and an optional port, preceded by a colon character (“:”). The authority syntax is defined in IETF RFC 3986, section 3.2.
The user info is an optional part of the URI authority, terminated by an at character (“@”) and is followed by a host. It is used in the telnet scheme:
telnet://<user>:<password>@<host>:<port>/
The user info syntax is defined in IETF RFC 3986, section 3.2.1.
The hostname contains a domain name or IP address. The host syntax is defined in IETF RFC 3986, section 3.2.2.
A domain name is a human-readable string used to identify a host. Domain names are registered in the Domain Name System (DNS).
The IP address can either be an IPv4 address (e.g. 127.0.0.1) or an IPv6 address (e.g. ::1). In a URI, an IPv6 address is enclosed in square bracket characters (“[]”).
The optional port is a non-negative integer value, always preceded by a colon character (“:”). If the port is not present, even if a colon is present, then the port is considered to have the value of the default port of the scheme. The port syntax is defined in IETF RFC 3986, section 3.2.3.
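The decomposition of an authority into user info, host and port can be sketched as follows. The names authority_parts and split_authority are hypothetical, and the sketch assumes the input is already a syntactically valid authority: it does not validate the host or check that the port contains only digits.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the authority sub-parts.
struct authority_parts {
  bool has_user_info = false;
  bool has_port = false;  // true whenever the ":" delimiter is present
  std::string user_info, host, port;
};

inline authority_parts split_authority(const std::string& authority) {
  authority_parts parts;
  std::string rest = authority;

  // The user info, if any, ends at the at character.
  const std::size_t at = authority.find('@');
  if (at != std::string::npos) {
    parts.has_user_info = true;
    parts.user_info = authority.substr(0, at);
    rest = authority.substr(at + 1);
  }

  // An IPv6 host is bracketed and may itself contain colons, so the host
  // ends at "]" in that case and at the first ":" otherwise.
  std::size_t host_end = 0;
  if (!rest.empty() && rest[0] == '[') {
    const std::size_t close = rest.find(']');
    host_end = (close == std::string::npos) ? rest.size() : close + 1;
  } else {
    const std::size_t colon = rest.find(':');
    host_end = (colon == std::string::npos) ? rest.size() : colon;
  }
  parts.host = rest.substr(0, host_end);

  if (host_end < rest.size() && rest[host_end] == ':') {
    parts.has_port = true;
    parts.port = rest.substr(host_end + 1);  // may be empty
  }
  return parts;
}
```

An empty port string with has_port set corresponds to the case described above in which the colon is present but the port is not, so the default port of the scheme applies.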
The path is a part of the hierarchical data and is a sequence of segments, each separated by a forward slash character (“/”). It is terminated by a question mark character (“?”) followed by a query, by a hash character (“#”) followed by a fragment, or by the end of the URI. The path syntax is defined in IETF RFC 3986, section 3.3.
Dot segments are elements in a path containing either a dot character (“.”) or two consecutive dot characters (“..”), separated by a forward slash character (“/”). Dot segments can be removed from a path as part of its normalization without changing the URI semantics.
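The removal of dot segments can be sketched by following the steps of the algorithm in IETF RFC 3986, section 5.2.4. The function names below are hypothetical and this is an illustration, not the proposed implementation.

```cpp
#include <cassert>
#include <string>

inline bool starts_with(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Removes "." and ".." segments from a path, following the input buffer and
// output buffer steps of RFC 3986, section 5.2.4.
inline std::string remove_dot_segments(std::string input) {
  std::string output;
  while (!input.empty()) {
    if (starts_with(input, "../")) {
      input.erase(0, 3);                      // step A
    } else if (starts_with(input, "./")) {
      input.erase(0, 2);                      // step A
    } else if (starts_with(input, "/./")) {
      input.erase(0, 2);                      // step B: "/./x" becomes "/x"
    } else if (input == "/.") {
      input = "/";                            // step B
    } else if (starts_with(input, "/../") || input == "/..") {
      // Step C: replace the prefix with "/" and drop the last output segment.
      input = (input == "/..") ? "/" : input.substr(3);
      const std::size_t slash = output.rfind('/');
      output.erase(slash == std::string::npos ? 0 : slash);
    } else if (input == "." || input == "..") {
      input.clear();                          // step D
    } else {
      // Step E: move the first segment, including any leading "/", to output.
      const std::size_t next = input.find('/', 1);
      output += input.substr(0, next);
      input.erase(0, next);
    }
  }
  return output;
}
```

With this sketch, the two examples given in RFC 3986, section 5.2.4, "/a/b/c/./../../g" and "mid/content=5/../6", reduce to "/a/g" and "mid/6" respectively.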
An absolute URI always specifies the scheme. URIs that don’t provide the scheme are called relative references.
An opaque URI is an absolute URI that does not provide two consecutive forward slash characters (“//”) after the scheme-delimiting colon character (“:”). Opaque URIs have no authority and the part immediately following the colon character (“:”) is the path. Some examples of opaque URIs are:
mailto:[email protected]
news:comp.lang.c++
URIs that provide two consecutive forward slash characters (“//”) following the scheme-delimiting colon character (“:”) are known as hierarchical URIs. Some examples are:
http://www.example.com/
ftp://[email protected]/
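The distinction between relative references, opaque URIs and hierarchical URIs, as defined above, can be sketched with a simple classifier. The enumeration uri_kind and the function classify are hypothetical names, and the sketch does not check that the scheme itself is well formed.

```cpp
#include <cassert>
#include <string>

enum class uri_kind { relative_reference, opaque, hierarchical };

// Classifies a URI reference using the definitions in this section.
inline uri_kind classify(const std::string& s) {
  // A scheme is present only if a ":" appears before any "/", "?" or "#".
  const std::size_t pos = s.find_first_of(":/?#");
  if (pos == std::string::npos || pos == 0 || s[pos] != ':') {
    return uri_kind::relative_reference;
  }
  // "//" directly after the scheme delimiter makes the URI hierarchical.
  return s.compare(pos + 1, 2, "//") == 0 ? uri_kind::hierarchical
                                          : uri_kind::opaque;
}
```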
URI normalization is the process by which a URI is transformed in order to determine if two URIs are equivalent. There are different levels of comparison, which trade off the number of false negatives against complexity. The normalization and comparison procedures are defined in IETF RFC 3986, section 6.
The comparison ladder describes how URIs can be compared using normalization in different ways, trading off the complexity of the method and the number of false negatives. The comparison ladder is defined in IETF RFC 3986, section 6.2 and IETF RFC 3987, section 5.3.
A base URI is required to be established in order for relative references to be usable. The specification for establishing a base URI is defined in IETF RFC 3986, section 5.1. The base URI can also be used as the basis for building URIs.
Relative references are URIs that do not provide a scheme. Relative references are only usable when a base URI is known, against which the relative reference can be resolved. The relative reference is defined in IETF RFC 3986, section 4.2 and IETF RFC 3987, section 6.5.
Relative references can be resolved against a base URI, producing an absolute URI. Only the scheme is required to be present in the base URI. Reference resolution is defined in IETF RFC 3986, section 5.
Pre-parsing and normalization of the URI is performed before transforming the reference.
The transform reference for resolving URIs is given in IETF RFC 3986, section 5.2.2.
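Reference resolution as described above can be sketched by combining a component split, the transform of IETF RFC 3986, section 5.2.2 (strict variant), the merge of section 5.2.3, dot segment removal (section 5.2.4) and recomposition (section 5.3). All names below are hypothetical and operate on plain strings; the sketch assumes the base URI is absolute and performs no validation.

```cpp
#include <cassert>
#include <regex>
#include <string>

struct parts {  // hypothetical component holder
  bool has_scheme = false, has_authority = false;
  bool has_query = false, has_fragment = false;
  std::string scheme, authority, path, query, fragment;
};

// Component split using the regular expression of RFC 3986, Appendix B.
inline parts split(const std::string& s) {
  static const std::regex re(
      R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
  parts p;
  std::smatch m;
  if (std::regex_match(s, m, re)) {
    p.has_scheme = m[1].matched;
    p.scheme = m[2].str();
    p.has_authority = m[3].matched;
    p.authority = m[4].str();
    p.path = m[5].str();
    p.has_query = m[6].matched;
    p.query = m[7].str();
    p.has_fragment = m[8].matched;
    p.fragment = m[9].str();
  }
  return p;
}

inline bool starts_with(const std::string& s, const std::string& prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// RFC 3986, section 5.2.4.
inline std::string remove_dot_segments(std::string in) {
  std::string out;
  while (!in.empty()) {
    if (starts_with(in, "../")) in.erase(0, 3);
    else if (starts_with(in, "./")) in.erase(0, 2);
    else if (starts_with(in, "/./")) in.erase(0, 2);
    else if (in == "/.") in = "/";
    else if (starts_with(in, "/../") || in == "/..") {
      in = (in == "/..") ? "/" : in.substr(3);
      const std::size_t slash = out.rfind('/');
      out.erase(slash == std::string::npos ? 0 : slash);
    } else if (in == "." || in == "..") in.clear();
    else {
      const std::size_t next = in.find('/', 1);
      out += in.substr(0, next);
      in.erase(0, next);
    }
  }
  return out;
}

// RFC 3986, section 5.2.3.
inline std::string merge(const parts& base, const std::string& ref_path) {
  if (base.has_authority && base.path.empty()) return "/" + ref_path;
  const std::size_t slash = base.path.rfind('/');
  if (slash == std::string::npos) return ref_path;
  return base.path.substr(0, slash + 1) + ref_path;
}

// RFC 3986, section 5.3.
inline std::string recompose(const parts& t) {
  std::string r;
  if (t.has_scheme) r += t.scheme + ":";
  if (t.has_authority) r += "//" + t.authority;
  r += t.path;
  if (t.has_query) r += "?" + t.query;
  if (t.has_fragment) r += "#" + t.fragment;
  return r;
}

// RFC 3986, section 5.2.2: the target starts as a copy of the reference and
// takes from the base whatever the reference leaves undefined.
inline std::string resolve(const std::string& base_uri,
                           const std::string& reference) {
  const parts base = split(base_uri);
  const parts ref = split(reference);
  parts t = ref;
  if (ref.has_scheme || ref.has_authority) {
    t.path = remove_dot_segments(ref.path);
  } else {
    t.has_authority = base.has_authority;
    t.authority = base.authority;
    if (ref.path.empty()) {
      t.path = base.path;
      if (!ref.has_query) { t.has_query = base.has_query; t.query = base.query; }
    } else if (ref.path[0] == '/') {
      t.path = remove_dot_segments(ref.path);
    } else {
      t.path = remove_dot_segments(merge(base, ref.path));
    }
  }
  if (!ref.has_scheme) { t.has_scheme = base.has_scheme; t.scheme = base.scheme; }
  return recompose(t);
}
```

The checks used while writing this sketch are taken from the reference resolution examples in RFC 3986, section 5.4, with the base URI "http://a/b/c/d;p?q".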
Percent encoding is the mechanism used to encode reserved characters in a URI. See IETF RFC 3986, section 2.1.
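A minimal sketch of percent encoding follows. It assumes a deliberately conservative policy, encoding every octet that is not an unreserved character (RFC 3986, section 2.3); the per-part encode functions proposed in this document would allow additional characters depending on the part. The function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Unreserved characters per RFC 3986, section 2.3.
inline bool is_unreserved(char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' ||
         c == '~';
}

// Replaces every octet that is not unreserved with a "%" followed by two
// upper-case hexadecimal digits.
inline std::string percent_encode(const std::string& input) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (const char c : input) {
    if (is_unreserved(c)) {
      out += c;
    } else {
      const unsigned char octet = static_cast<unsigned char>(c);
      out += '%';
      out += hex[octet >> 4];
      out += hex[octet & 0x0F];
    }
  }
  return out;
}
```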
All characters in a URI scheme and host are lower-case. All hexadecimal digits within a percent-encoded triplet are upper-case. See IETF RFC 3986, section 6.2.2.1 and IETF RFC 3987, section 5.3.2.1.
URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character. See IETF RFC 3986, section 6.2.2.2 and IETF RFC 3987, section 5.3.2.3.
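The two rules that concern percent-encoded triplets (upper-case hexadecimal digits, and decoding of triplets that represent unreserved characters) can be sketched in one pass over a string. The function names are hypothetical; lower-casing the scheme and host requires the URI to be split into parts first and is not shown. Malformed triplets are left unchanged, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

inline bool is_unreserved(char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' ||
         c == '~';
}

// Returns the value of a hexadecimal digit, or -1 if c is not one.
inline int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

inline std::string normalize_percent_encoding(const std::string& input) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  for (std::size_t i = 0; i < input.size(); ++i) {
    const bool is_triplet = input[i] == '%' && i + 2 < input.size() &&
                            hex_value(input[i + 1]) >= 0 &&
                            hex_value(input[i + 2]) >= 0;
    if (!is_triplet) {
      out += input[i];
      continue;
    }
    const int high = hex_value(input[i + 1]);
    const int low = hex_value(input[i + 2]);
    const char decoded = static_cast<char>(high * 16 + low);
    if (is_unreserved(decoded)) {
      out += decoded;   // RFC 3986, section 6.2.2.2
    } else {
      out += '%';       // RFC 3986, section 6.2.2.1
      out += hex[high];
      out += hex[low];
    }
    i += 2;  // skip the two hexadecimal digits just consumed
  }
  return out;
}
```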
Dot segments [uri.definition.dot-segments] should be removed from URIs that are not relative references. See IETF RFC 3986, section 6.2.2.3 and IETF RFC 3987, section 5.3.2.4.
In Unicode, different sequences of characters could be defined as equivalent depending on how they are encoded. See IETF RFC 3987, section 5.3.2.2.
A zone index is used to identify to which scope a non-global address belongs in an IPv6 address. It is specified in IETF RFC 6874.
Throughout this Technical Specification, char, wchar_t, char16_t, and char32_t are collectively called encoded character types.
Template parameters named charT shall be one of the encoded character types.
Template parameters named InputIterator shall meet the C++ Standard’s library input iterator requirements ([input.iterators] C++11 section 24.2.3) and shall have a value type that is one of the encoded character types.
[Note: Use of an encoded character type implies an associated encoding. Since signed char and unsigned char have no implied encoding, they are not included as permitted types. –end note]
Note
The above text uses similar wording to N3940 [fs.req] section 5.
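The requirement on template parameters named charT could be checked at compile time with a trait such as the following. The trait is_encoded_character_type and the function accepts_char_type are hypothetical; this proposal defines the term "encoded character types" but no such trait.

```cpp
#include <cassert>
#include <type_traits>

// True only for the four encoded character types named in this section.
template <class T>
struct is_encoded_character_type : std::false_type {};
template <> struct is_encoded_character_type<char> : std::true_type {};
template <> struct is_encoded_character_type<wchar_t> : std::true_type {};
template <> struct is_encoded_character_type<char16_t> : std::true_type {};
template <> struct is_encoded_character_type<char32_t> : std::true_type {};

// Example of constraining a template parameter named charT. Instantiating
// this with signed char or unsigned char fails to compile.
template <class charT>
constexpr bool accepts_char_type() {
  static_assert(is_encoded_character_type<charT>::value,
                "charT shall be one of the encoded character types");
  return true;
}
```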
Function template parameters named Source, UserInfoSource, HostSource, PortSource, KeySource, ParamSource shall be one of:
Arguments of type Source, UserInfoSource, HostSource, PortSource, KeySource, ParamSource shall not be null pointers.
Note
This is similar wording to the filesystem path requirements in N3940 ([path.req]).
The uri class shall parse according to the rules described in IETF RFC 3986, Section 3.
The uri class shall correctly parse IPv6 addresses, described in IETF RFC 3986.
The uri class shall parse internationalized URIs according to IETF RFC 3987, section 2.
The uri class shall parse zone IDs in IPv6 addresses according to IETF RFC 6874, section 2.
Template parameters named Allocator shall meet the C++ Standard’s library Allocator requirements (C++11 17.6.3.5).
#include <string> // std::basic_string
#include <system_error> // std::error_code
#include <iosfwd> // std::basic_istream, std::basic_ostream
#include <iterator> // std::iterator_traits
#include <memory> // std::allocator
#include <experimental/optional> // std::experimental::optional
#include <experimental/string_view> // std::experimental::basic_string_view
#include <experimental/filesystem> // std::experimental::filesystem::path
namespace std {
namespace experimental {
// class declarations
class uri;
class uri_builder;
class uri_error;
enum class uri_errc {
// uri syntax errors
invalid_syntax = 1,
// uri reference and resolution errors
base_uri_is_empty,
base_uri_is_not_absolute,
base_uri_is_opaque,
base_uri_does_not_match,
// builder errors
invalid_uri,
invalid_scheme,
invalid_user_info,
invalid_host,
invalid_port,
invalid_path,
invalid_query,
invalid_fragment,
// decoding errors
not_enough_input,
non_hex_input,
conversion_failed,
};
enum class uri_normalization_level {
string_comparison,
syntax_based,
};
// factory functions
template <class Source>
uri make_uri(const Source& source, error_code& ec);
template <class InputIterator>
uri make_uri(InputIterator first, InputIterator last, error_code& ec);
template <class Source, class Allocator>
uri make_uri(const Source& source, const Allocator& a, error_code& ec);
template <class InputIterator, class Allocator>
uri make_uri(InputIterator first, InputIterator last, const Allocator& a, error_code& ec);
// interoperability with filesystem::path
uri to_uri(const experimental::filesystem::path& p);
template <class Allocator>
uri to_uri(const experimental::filesystem::path& p, const Allocator& a);
experimental::filesystem::path to_filesystem_path(const uri& u);
experimental::filesystem::path to_filesystem_path(const uri& u, error_code& ec);
// equality and comparison operators
constexpr bool operator== (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator!= (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator< (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator> (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator<= (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator>= (const uri& lhs, const uri& rhs) noexcept;
// stream operators
template <typename charT, class traits>
basic_ostream<charT, traits>&
operator<< (basic_ostream<charT, traits>& os, const uri& u);
template <typename charT, class traits>
basic_istream<charT, traits>&
operator>> (basic_istream<charT, traits>& is, uri& u);
// swap functions
void swap(uri& lhs, uri& rhs) noexcept;
} // namespace experimental
template <class T> struct hash;
template <> struct hash<experimental::uri>;
namespace experimental {
// error handling
error_code make_error_code(uri_errc e) noexcept;
error_condition make_error_condition(uri_errc e) noexcept;
const error_category& uri_category() noexcept;
} // namespace experimental
template <>
struct is_error_code_enum<experimental::uri_errc> : public true_type { };
} // namespace std
The <experimental/uri> header contains a declaration for a uri class, a uri_builder class and an exception class, uri_error in the std::experimental namespace.
// factory functions
template <class Source>
uri make_uri(const Source& source, error_code& ec);
template <class InputIterator>
uri make_uri(InputIterator first, InputIterator last, error_code& ec);
template <class Source, class Allocator>
uri make_uri(const Source& source, const Allocator& a, error_code& ec);
template <class InputIterator, class Allocator>
uri make_uri(InputIterator first, InputIterator last, const Allocator& a,
error_code& ec);
uri to_uri(const experimental::filesystem::path& p);
// On UNIX
std::experimental::filesystem::path p("/home/example/.bashrc");
std::experimental::uri u(std::experimental::to_uri(p));
assert(u.is_absolute());
assert("file:///home/example/.bashrc" == u.string());
assert("/home/example/.bashrc" == *u.path());
std::experimental::filesystem::path p(".bashrc");
std::experimental::uri u(std::experimental::to_uri(p));
assert(u.is_relative());
assert(".bashrc" == u.string());
assert(".bashrc" == *u.path());
template <class Allocator>
uri to_uri(const experimental::filesystem::path& p, const Allocator& a);
std::experimental::filesystem::path p("/home/example/.bashrc");
my_ns::my_allocator a;
std::experimental::uri u(std::experimental::to_uri(p, a));
assert("file:///home/example/.bashrc" == u.string());
experimental::filesystem::path to_filesystem_path(const uri& u);
try {
std::experimental::uri u("file:///home/example/.bashrc");
std::experimental::filesystem::path p(std::experimental::to_filesystem_path(u));
assert(p.is_absolute());
assert("/home/example/.bashrc" == p.string());
}
catch (const std::experimental::uri_error& e) {
// handle error
}
try {
std::experimental::uri u("file:///home/example/");
std::experimental::uri v("file:///home/example/.bashrc");
std::experimental::uri w = v.make_relative(u);
std::experimental::filesystem::path p(std::experimental::to_filesystem_path(w));
assert(p.is_relative());
assert(".bashrc" == p.string());
}
catch (const std::experimental::uri_error& e) {
// handle error
}
experimental::filesystem::path to_filesystem_path(const uri& u, error_code& ec);
std::experimental::uri u("file:///home/example/.bashrc");
std::error_code ec;
std::experimental::filesystem::path p(std::experimental::to_filesystem_path(u, ec));
if (!ec) {
assert(p.is_absolute());
assert("/home/example/.bashrc" == p.string());
}
else {
// handle error
}
constexpr bool operator== (const uri& lhs, const uri& rhs) noexcept;
[Note: The equality operator uses a character-by-character comparison of two uri objects. –end note]
constexpr bool operator!= (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator< (const uri& lhs, const uri& rhs) noexcept;
[Note: The less-than operator uses a character-by-character comparison of two uri objects. –end note]
constexpr bool operator> (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator<= (const uri& lhs, const uri& rhs) noexcept;
constexpr bool operator>= (const uri& lhs, const uri& rhs) noexcept;
template <typename charT, class traits>
basic_ostream<charT, traits>&
operator<< (basic_ostream<charT, traits>& os, const uri& u);
[Note: u is percent encoded according to [uri.definition.percent-encoding]. –end note]
template <typename charT, class traits>
basic_istream<charT, traits>&
operator>> (basic_istream<charT, traits>& is, uri& u);
void swap(uri& lhs, uri& rhs) noexcept;
template <> struct hash<experimental::uri>;
Some URI functions provide two overloads, one that throws an exception to report errors, and a second that sets a std::error_code.
Member functions of uri not having an argument of type std::error_code& report errors as follows, unless otherwise specified:
Functions that have an argument of type std::error_code& report errors as follows:
error_code make_error_code(uri_errc e) noexcept;
error_condition make_error_condition(uri_errc e) noexcept;
const error_category& uri_category() noexcept;
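The three functions above follow the usual <system_error> extension pattern. The sketch below shows how an enumeration such as uri_errc can be connected to std::error_code. It uses a hypothetical namespace sketch rather than std::experimental, an abbreviated enumeration, and category and message strings that are assumptions of this example.

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <type_traits>

namespace sketch {  // hypothetical stand-in for std::experimental

// Abbreviated, so most enumerator values differ from the synopsis.
enum class uri_errc {
  invalid_syntax = 1,
  base_uri_is_empty,
  not_enough_input,
  non_hex_input,
};

class uri_category_impl : public std::error_category {
 public:
  const char* name() const noexcept override { return "uri"; }
  std::string message(int ev) const override {
    switch (static_cast<uri_errc>(ev)) {
      case uri_errc::invalid_syntax: return "invalid URI syntax";
      case uri_errc::base_uri_is_empty: return "base URI is empty";
      case uri_errc::not_enough_input: return "not enough input";
      case uri_errc::non_hex_input: return "non-hexadecimal input";
    }
    return "unknown URI error";
  }
};

// A single category object identifies all URI error codes.
inline const std::error_category& uri_category() noexcept {
  static const uri_category_impl instance{};
  return instance;
}

inline std::error_code make_error_code(uri_errc e) noexcept {
  return std::error_code(static_cast<int>(e), uri_category());
}

inline std::error_condition make_error_condition(uri_errc e) noexcept {
  return std::error_condition(static_cast<int>(e), uri_category());
}

}  // namespace sketch

namespace std {
// Enables implicit conversion from sketch::uri_errc to std::error_code.
template <>
struct is_error_code_enum<sketch::uri_errc> : public true_type {};
}  // namespace std
```

With the is_error_code_enum specialization in place, a function taking an error_code& argument can report a failure with a plain assignment such as ec = sketch::uri_errc::invalid_syntax, and callers can compare the result against an enumerator directly.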
namespace std {
namespace experimental {
class uri {
public:
// typedefs
typedef unspecified value_type;
typedef implementation-defined iterator;
typedef implementation-defined const_iterator;
typedef experimental::basic_string_view<value_type> string_view;
// constructors and destructor
uri();
template <class Source, class Allocator = allocator<value_type>>
explicit uri(const Source& source, const Allocator& a = Allocator());
template <class InputIterator, class Allocator = allocator<value_type>>
uri(InputIterator begin, InputIterator end, const Allocator& a = Allocator());
uri(const uri& other);
uri(uri&& other) noexcept;
~uri();
// assignment
uri& operator= (const uri& other);
uri& operator= (uri&& other) noexcept;
// modifiers
void swap(uri& other) noexcept;
// iterators
constexpr const_iterator begin() const noexcept;
constexpr const_iterator end() const noexcept;
constexpr const_iterator cbegin() const noexcept;
constexpr const_iterator cend() const noexcept;
// accessors
constexpr experimental::optional<string_view> scheme() const;
constexpr experimental::optional<string_view> user_info() const;
constexpr experimental::optional<string_view> host() const;
constexpr experimental::optional<string_view> port() const;
template <class intT>
constexpr experimental::optional<intT> port() const;
constexpr experimental::optional<string_view> path() const;
constexpr experimental::optional<string_view> authority() const;
constexpr experimental::optional<string_view> query() const;
constexpr experimental::optional<string_view> fragment() const;
// string accessors
template <class charT,
class traits = char_traits<charT>,
class Allocator = allocator<charT>>
basic_string<charT, traits, Allocator>
to_string(const Allocator& a = Allocator()) const;
string string() const;
wstring wstring() const;
string u8string() const;
u16string u16string() const;
u32string u32string() const;
// query
constexpr bool empty() const noexcept;
constexpr bool is_absolute() const noexcept;
constexpr bool is_relative() const noexcept;
constexpr bool is_hierarchical() const noexcept;
constexpr bool is_opaque() const noexcept;
// transformers
uri normalize(uri_normalization_level level) const;
template <class Allocator>
uri normalize(uri_normalization_level level, const Allocator& alloc) const;
uri normalize(uri_normalization_level level, error_code& ec) const;
template <class Allocator>
uri normalize(uri_normalization_level level, const Allocator& alloc,
error_code& ec) const;
uri make_relative(const uri& u) const;
template <class Allocator>
uri make_relative(const uri& u, const Allocator& a) const;
uri make_relative(const uri& u, error_code& ec) const;
template <class Allocator>
uri make_relative(const uri& u, const Allocator& a, error_code& ec) const;
uri resolve(const uri& u) const;
template <class Allocator>
uri resolve(const uri& u, const Allocator& a) const;
uri resolve(const uri& u, error_code& ec) const;
template <class Allocator>
uri resolve(const uri& u, const Allocator& a, error_code& ec) const;
// comparison
constexpr int compare(const uri& other, uri_normalization_level level) const noexcept;
// percent encoding and decoding
template <class InputIterator, class OutputIterator>
static OutputIterator encode_user_info(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_host(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_port(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_path(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_query(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_fragment(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator decode(InputIterator begin, InputIterator end,
OutputIterator out);
};
} // namespace experimental
} // namespace std
For member functions returning strings, value type and encoding conversion is performed if the value type of the argument or return differs from uri::value_type. Encoding and method of conversion for the argument or return value to be converted to is determined by its value type:
Note
This is based on wording in the filesystem path requirements in N3940 (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3940.html#path.arg.convert)
Each URI part is required to be a contiguous memory block.
Case normalization shall be performed according to IETF RFC 3986, section 6.2.2.1 and IETF RFC 3987, section 5.3.2.1.
Percent encoding normalization shall be performed according to IETF RFC 3986, section 6.2.2.2 and IETF RFC 3987, section 5.3.2.3.
Removing dot segments (“.”, “..”) from a path shall conform to IETF RFC 3986, section 5.2.4.
URI References returned by std::experimental::uri::make_relative shall be transformed by using the algorithm in IETF RFC 3986, section 5.2.2.
uri();
uri(const uri& other);
uri(uri&& other) noexcept;
template <class Source, class Allocator = allocator<value_type>>
uri(const Source& source, const Allocator& a = Allocator());
template <class InputIterator, class Allocator = allocator<value_type>>
uri(InputIterator begin, InputIterator end, const Allocator& a = Allocator());
uri& operator= (const uri& other);
uri& operator= (uri&& other) noexcept;
void swap(uri& other) noexcept;
constexpr const_iterator begin() const noexcept;
constexpr const_iterator end() const noexcept;
constexpr const_iterator cbegin() const noexcept;
constexpr const_iterator cend() const noexcept;
constexpr experimental::optional<string_view> scheme() const;
constexpr experimental::optional<string_view> user_info() const;
constexpr experimental::optional<string_view> host() const;
constexpr experimental::optional<string_view> port() const;
template <class intT>
constexpr experimental::optional<intT> port() const;
constexpr experimental::optional<string_view> path() const;
constexpr experimental::optional<string_view> authority() const;
constexpr experimental::optional<string_view> query() const;
constexpr experimental::optional<string_view> fragment() const;
template <class charT,
class traits = char_traits<charT>,
class Allocator = allocator<charT>>
basic_string<charT, traits, Allocator>
to_string(const Allocator& a = Allocator()) const;
Note
The name to_string has been chosen to be consistent with the to_string member function in std::bitset [20.5.2.34]. This, however, is inconsistent with the name (string) in the path class in the filesystem working draft N3940 [path.native.obs]. The same applies to all the string accessors.
string string() const;
wstring wstring() const;
string u8string() const;
u16string u16string() const;
u32string u32string() const;
constexpr bool empty() const noexcept;
constexpr bool is_absolute() const noexcept;
constexpr bool is_relative() const noexcept;
constexpr bool is_hierarchical() const noexcept;
constexpr bool is_opaque() const noexcept;
This proposal specifies three transformer functions: normalize, make_relative and resolve.
uri normalize(uri_normalization_level level) const;
template <class Allocator>
uri normalize(uri_normalization_level level, const Allocator& a) const;
uri normalize(uri_normalization_level level, error_code& ec) const;
template <class Allocator>
uri normalize(uri_normalization_level level, const Allocator& a,
error_code& ec) const;
uri make_relative(const uri& u) const;
try {
std::experimental::uri u("http://www.example.org/path/");
std::experimental::uri v("http://www.example.org/path/to/file.html");
std::experimental::uri w = v.make_relative(u);
assert(w.string() == "to/file.html");
}
catch (const std::experimental::uri_error& e) {
// handle error
}
template <class Allocator>
uri make_relative(const uri& u, const Allocator& a) const;
std::allocator<std::experimental::uri::value_type> a;
try {
std::experimental::uri u("http://www.example.org/path/", a);
std::experimental::uri v("http://www.example.org/path/to/file.html", a);
std::experimental::uri w = v.make_relative(u, a);
assert(w.string() == "to/file.html");
}
catch (const std::experimental::uri_error& e) {
// handle error
}
uri make_relative(const uri& base, error_code& ec) const;
std::experimental::uri u("http://www.example.org/path/");
std::experimental::uri v("http://www.example.org/path/to/file.html");
std::error_code ec;
std::experimental::uri w = v.make_relative(u, ec);
if (!ec) {
assert(w.string() == "to/file.html");
}
else {
// handle error
}
template <class Allocator>
uri make_relative(const uri& base, const Allocator &a,
error_code& ec) const;
std::allocator<std::experimental::uri::value_type> a;
std::experimental::uri u("http://www.example.org/path/", a);
std::experimental::uri v("http://www.example.org/path/to/file.html", a);
std::error_code ec;
std::experimental::uri w = v.make_relative(u, a, ec);
if (!ec) {
assert(w.string() == "to/file.html");
}
else {
// handle error ec
}
uri resolve(const uri& u) const;
template <class Allocator>
uri resolve(const uri& u, const Allocator& a) const;
uri resolve(const uri& u, error_code& ec) const;
template <class Allocator>
uri resolve(const uri& u, const Allocator& a, error_code& ec) const;
constexpr int compare(const uri& other, uri_normalization_level level) const noexcept;
template <class InputIterator, class OutputIterator>
static OutputIterator encode_user_info(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_host(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_port(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_path(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_query(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator encode_fragment(InputIterator begin, InputIterator end,
OutputIterator out);
template <class InputIterator, class OutputIterator>
static OutputIterator decode(InputIterator begin, InputIterator end,
OutputIterator out);
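The decode signature above takes an input range and an output iterator. A sketch of such a function is given below as a free function with the hypothetical name percent_decode. The two failure cases mirror the uri_errc enumerators not_enough_input and non_hex_input; reporting them by throwing std::invalid_argument is an assumption of this sketch, not the error reporting mechanism of this proposal.

```cpp
#include <cassert>
#include <iterator>
#include <stdexcept>
#include <string>

// Returns the value of a hexadecimal digit, or -1 if c is not one.
inline int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Copies [first, last) to out, replacing each percent-encoded triplet with
// the octet it represents.
template <class InputIterator, class OutputIterator>
OutputIterator percent_decode(InputIterator first, InputIterator last,
                              OutputIterator out) {
  while (first != last) {
    if (*first != '%') {
      *out++ = *first++;
      continue;
    }
    ++first;  // skip the "%"
    int value = 0;
    for (int i = 0; i < 2; ++i) {
      if (first == last) {
        throw std::invalid_argument("not enough input");
      }
      const int digit = hex_value(*first++);
      if (digit < 0) {
        throw std::invalid_argument("non-hex input");
      }
      value = value * 16 + digit;
    }
    *out++ = static_cast<char>(value);
  }
  return out;
}

// Convenience wrapper for std::string.
inline std::string percent_decode(const std::string& input) {
  std::string out;
  percent_decode(input.begin(), input.end(), std::back_inserter(out));
  return out;
}
```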
namespace std {
namespace experimental {
class uri_builder {
public:
// constructors and destructor
uri_builder();
explicit uri_builder(const uri& base_uri);
template <class Source>
explicit uri_builder(const Source& base_uri);
template <class InputIterator>
uri_builder(InputIterator begin, InputIterator end);
uri_builder(const uri_builder& other);
uri_builder(uri_builder&& other) noexcept;
~uri_builder();
// assignment
uri_builder& operator= (const uri_builder& other);
uri_builder& operator= (uri_builder&& other) noexcept;
// modifiers
void swap(uri_builder& other) noexcept;
// setters
template <class Source>
uri_builder& scheme(const Source& scheme);
template <class InputIterator>
uri_builder& scheme(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& user_info(const Source& user_info);
template <class InputIterator>
uri_builder& user_info(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& host(const Source& host);
template <class InputIterator>
uri_builder& host(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& port(const Source& port);
template <class InputIterator>
uri_builder& port(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& authority(const Source& authority);
template <class InputIterator>
uri_builder& authority(InputIterator begin, InputIterator end);
template <class UserInfoSource, class HostSource, class PortSource>
uri_builder& authority(const UserInfoSource& user_info,
const HostSource& host, const PortSource& port);
template <class Source>
uri_builder& path(const Source& path);
template <class InputIterator>
uri_builder& path(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& append_path(const Source& path);
template <class InputIterator>
uri_builder& append_path(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& query(const Source& query);
template <class InputIterator>
uri_builder& query(InputIterator begin, InputIterator end);
template <class KeySource, class ParamSource>
uri_builder& append_query(const KeySource& key, const ParamSource& param);
template <class Source>
uri_builder& fragment(const Source& fragment);
template <class InputIterator>
uri_builder& fragment(InputIterator begin, InputIterator end);
// builder
uri uri() const;
};
} // namespace experimental
} // namespace std
The URI shall be built according to component recomposition rules in IETF RFC 3986, section 5.3.
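The recomposition rules referenced above can be sketched as follows. The type component and the functions present, absent and recompose are hypothetical; component stands in for an optional URI part so that an undefined part can be distinguished from an empty one.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an optional URI part.
struct component {
  bool defined;
  std::string value;
};

inline component present(const std::string& value) {
  return component{true, value};
}

inline component absent() { return component{false, std::string()}; }

// Recomposes a URI reference from its parts, following the pseudocode of
// RFC 3986, section 5.3. The path is always appended, even when empty.
inline std::string recompose(const component& scheme,
                             const component& authority,
                             const std::string& path,
                             const component& query,
                             const component& fragment) {
  std::string result;
  if (scheme.defined) {
    result += scheme.value;
    result += ':';
  }
  if (authority.defined) {
    result += "//";
    result += authority.value;
  }
  result += path;
  if (query.defined) {
    result += '?';
    result += query.value;
  }
  if (fragment.defined) {
    result += '#';
    result += fragment.value;
  }
  return result;
}
```

A defined but empty query still contributes its delimiter, so recomposing a scheme "http", an authority "h", an empty path and an empty query yields "http://h?".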
uri_builder();
uri_builder(const uri& base_uri);
std::experimental::uri base_uri("http://www.example.com");
std::experimental::uri_builder builder(base_uri);
builder.path("path");
assert("http://www.example.com/path" == builder.uri().string());
template <class Source>
uri_builder(const Source& base_uri);
template <class InputIterator>
uri_builder(InputIterator begin, InputIterator end);
std::experimental::uri_builder builder("http://www.example.com");
builder.path("path");
assert("http://www.example.com/path" == builder.uri().string());
uri_builder(const uri_builder& other);
uri_builder(uri_builder&& other) noexcept;
uri_builder& operator= (const uri_builder& other);
uri_builder& operator= (uri_builder&& other) noexcept;
void swap(uri_builder& other) noexcept;
template <class Source>
uri_builder& scheme(const Source& scheme);
template <class InputIterator>
uri_builder& scheme(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& user_info(const Source& user_info);
template <class InputIterator>
uri_builder& user_info(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& host(const Source& host);
template <class InputIterator>
uri_builder& host(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& port(const Source& port);
template <class InputIterator>
uri_builder& port(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& authority(const Source& authority);
template <class InputIterator>
uri_builder& authority(InputIterator begin, InputIterator end);
template <class UserInfoSource, class HostSource, class PortSource>
uri_builder& authority(const UserInfoSource& user_info,
const HostSource& host, const PortSource& port);
template <class Source>
uri_builder& path(const Source& path);
template <class InputIterator>
uri_builder& path(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& append_path(const Source& path);
template <class InputIterator>
uri_builder& append_path(InputIterator begin, InputIterator end);
template <class Source>
uri_builder& query(const Source& query);
template <class InputIterator>
uri_builder& query(InputIterator begin, InputIterator end);
template <class KeySource, class ParamSource>
uri_builder& append_query(const KeySource& key, const ParamSource& param);
template <class Source>
uri_builder& fragment(const Source& fragment);
template <class InputIterator>
uri_builder& fragment(InputIterator begin, InputIterator end);
uri uri() const;
namespace std {
namespace experimental {
class uri_error : public system_error {
public:
uri_error(const string& what_arg, error_code ec);
virtual ~uri_error();
virtual const char *what() const noexcept;
};
} // namespace experimental
} // namespace std
uri_error(const string& what_arg, error_code ec);
const char *what() const noexcept;
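As an illustration of how such an exception type might sit on top of std::system_error, the following sketch forwards its arguments to the base class. The namespace sketch and the use of std::errc::invalid_argument as a stand-in error code are assumptions made for this example only; the constructor signature mirrors the synopsis above.

```cpp
#include <cassert>
#include <string>
#include <system_error>

namespace sketch {  // hypothetical namespace, not part of the TS

class uri_error : public std::system_error {
public:
  // std::system_error takes the error_code first, so the arguments are swapped.
  uri_error(const std::string& what_arg, std::error_code ec)
      : std::system_error(ec, what_arg) {}
};

// Throws a uri_error and returns the code observed through the base class.
inline std::error_code throw_and_catch() {
  std::error_code seen;
  try {
    throw uri_error("unable to parse URI",
                    std::make_error_code(std::errc::invalid_argument));
  } catch (const std::system_error& e) {
    seen = e.code();
  }
  return seen;
}

inline void uri_error_example() {
  assert(throw_and_catch() == std::errc::invalid_argument);
}

}  // namespace sketch
```

Because the class derives from std::system_error, existing handlers that catch std::system_error or std::exception would also see errors of this kind.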
enum class uri_errc {
// uri syntax errors
invalid_syntax = 1,
// uri relative reference and resolution errors
base_uri_is_empty,
base_uri_is_not_absolute,
base_uri_is_opaque,
base_uri_does_not_match,
// builder errors
invalid_uri,
invalid_scheme,
invalid_user_info,
invalid_host,
invalid_port,
invalid_path,
invalid_query,
invalid_fragment,
// decoding errors
not_enough_input,
non_hex_input,
conversion_failed,
};
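One possible way to make an enumeration such as uri_errc usable with std::error_code is the usual error category pattern, sketched below. The enumeration here is a shortened copy placed in a hypothetical namespace sketch, and the category name, the message strings, and the uri_category function are assumptions for illustration. They are not specified by this TS.

```cpp
#include <cassert>
#include <string>
#include <system_error>

namespace sketch {  // hypothetical namespace, not part of the TS

// Shortened copy of the enumeration, for illustration only.
enum class uri_errc { invalid_syntax = 1, not_enough_input, non_hex_input };

class uri_category_impl : public std::error_category {
public:
  const char* name() const noexcept override { return "uri"; }
  std::string message(int ev) const override {
    switch (static_cast<uri_errc>(ev)) {
      case uri_errc::invalid_syntax: return "unable to parse URI string";
      case uri_errc::not_enough_input: return "percent-encoded input is truncated";
      case uri_errc::non_hex_input: return "percent-encoded input is not hexadecimal";
      default: return "unknown URI error";
    }
  }
};

// Single shared category instance.
inline const std::error_category& uri_category() {
  static const uri_category_impl instance{};
  return instance;
}

// Found by argument-dependent lookup when an error_code is built from uri_errc.
inline std::error_code make_error_code(uri_errc e) {
  return std::error_code(static_cast<int>(e), uri_category());
}

}  // namespace sketch

namespace std {
// Opt in to the implicit conversion from the enumeration to std::error_code.
template <>
struct is_error_code_enum<sketch::uri_errc> : true_type {};
}  // namespace std

inline void uri_errc_example() {
  std::error_code ec = sketch::uri_errc::non_hex_input;
  assert(ec.category() == sketch::uri_category());
  assert(ec == sketch::uri_errc::non_hex_input);
}
```

With this in place, functions that report errors through a std::error_code out-parameter can assign an enumerator directly.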
invalid_syntax
This error is set when the parser is unable to parse the given URI string.
base_uri_is_empty
This error is set when the base URI passed to make_relative or resolve is empty.
base_uri_is_not_absolute
This error is set when the base URI passed to make_relative or resolve is not absolute (it is itself a relative reference).
base_uri_is_opaque
This error is set when the base URI passed to make_relative or resolve is opaque.
base_uri_does_not_match
This error is set when the base URI passed to make_relative does not match the prefix of the URI.
invalid_uri
This error is set by the uri_builder when the builder is unable to construct a valid URI.
invalid_scheme
This error is set by the uri_builder if the scheme provided is invalid.
invalid_user_info
This error is set by the uri_builder if the user info provided is invalid.
invalid_host
This error is set by the uri_builder if the host provided is invalid.
invalid_port
This error is set by the uri_builder if the port provided is invalid.
invalid_path
This error is set by the uri_builder if the path provided is invalid.
invalid_query
This error is set by the uri_builder if the query provided is invalid.
invalid_fragment
This error is set by the uri_builder if the fragment provided is invalid.
not_enough_input
This error is set when not enough input was given to the decoder to be able to decode the percent encoded string, e.g. %2.
non_hex_input
This error is set when non-hex input is given to the decoder, e.g. %GG.
conversion_failed
This error is set when the decoder was unable to convert the percent encoded string, e.g. %80.
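The first two decoding errors can be illustrated with a small percent decoder. This is only a sketch under stated assumptions: the namespace sketch, the decode_error enumeration, and the percent_decode function are hypothetical and are not the decoding interface of this TS. The conversion step behind conversion_failed is deliberately left out, so an input such as %80 is simply decoded to a single byte here.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

namespace sketch {  // hypothetical namespace, not part of the TS

// Only the two decoding errors that this sketch can detect.
enum class decode_error { none, not_enough_input, non_hex_input };

// Value of a hexadecimal digit, or -1 if c is not one.
inline int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decodes every %XX triplet in `in`. On failure returns std::nullopt
// and stores the reason in `err`.
inline std::optional<std::string> percent_decode(std::string_view in,
                                                 decode_error& err) {
  err = decode_error::none;
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] != '%') { out += in[i]; continue; }
    if (in.size() - i < 3) {  // fewer than two characters follow the '%'
      err = decode_error::not_enough_input;
      return std::nullopt;
    }
    const int hi = hex_value(in[i + 1]);
    const int lo = hex_value(in[i + 2]);
    if (hi < 0 || lo < 0) {
      err = decode_error::non_hex_input;
      return std::nullopt;
    }
    out += static_cast<char>(hi * 16 + lo);
    i += 2;  // skip the two hex digits just consumed
  }
  return out;
}

inline void percent_decode_example() {
  decode_error err;
  assert(!percent_decode("%2", err));
  assert(err == decode_error::not_enough_input);
  assert(!percent_decode("%GG", err));
  assert(err == decode_error::non_hex_input);
}

}  // namespace sketch
```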
Note
Issues
1. Normalization Invariant
At the Chicago meeting, it was strongly suggested that URIs always be normalized. However, this does not work well with filesystem URIs, especially those that refer to symbolic links, where it makes sense to keep the URI unnormalized. The changes made after the Chicago meeting have therefore been reverted, and URIs can be left unnormalized. A normalize member function exists to provide the conversion explicitly.
2. Scheme-Specific Normalization
There needs to be an extension point to allow scheme- and protocol-specific normalization.
3. empty() vs. is_absolute() vs. is_opaque()
In the minutes of the Chicago meeting, there was a suggestion that the is_ prefix is being applied inconsistently. The current naming is consistent with at least the filesystem proposal, but clarification should be sought from the LEWG.
4. Factory Functions
The make_uri factory functions are free functions, but the LEWG needs to clarify if they can remain this way or if they should be static members of uri.
5. Source Template Parameters
The Source template parameters seem overly generic; this will be taken up with the LEWG.
6. to_string() etc.
A decision is needed on whether the string accessors [class.uri.members.accessors] should be consistent with std::experimental::filesystem::path, or whether they will be different by design.
7. string_view null-state
The new TS has changed basic_string_view so that it has its own null state, which can be created by calling the basic_string_view default constructor. There may be a case for removing optional from the URI part accessors. If that were to happen, the authors of this paper would recommend a more explicit accessor in basic_string_view to test the null state than checking the pointer returned by data().
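The distinction raised in issue 7 can be seen in a short sketch. It uses std::string_view from C++17 rather than the experimental basic_string_view discussed above, and the is_null helper is a hypothetical name standing in for the "more explicit accessor" that the authors would recommend.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: true only for a view whose data() is a null pointer.
inline bool is_null(std::string_view sv) { return sv.data() == nullptr; }

inline void null_state_example() {
  std::string_view absent;       // default constructed: data() == nullptr
  std::string_view present("");  // empty, but refers to a real character array
  assert(absent.empty() && present.empty());
  assert(is_null(absent));
  assert(!is_null(present));
}
```

Both views are empty, so empty() alone cannot tell an absent URI part from a part that is present but empty. That is the information optional currently carries in the accessors.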
Note
Acknowledgements
C++ Network Library users and mailing list
Kyle Kloepper and Niklas Gustafsson for providing valuable feedback and encouragement, and for presenting different versions of this proposal at committee meetings.
The BSI C++ panel for their guidance and patience in reviewing every draft of this proposal.
Beman Dawes and his Filesystem proposal, which strongly influenced the class design.
Thiago Macieira and Daniel Kruegler for important feedback on the draft proposal.
David Thaler for suggesting corrections to errors in referencing IETF standards.
Wikipedia, for being there.