Parse a URL
You parse a full URL by setting the CURLUPART_URL part in the handle:
CURLU *h = curl_url();
rc = curl_url_set(h, CURLUPART_URL, "https://example.com:449/foo/bar?name=moo", 0);
If successful, rc contains CURLUE_OK and the different URL components are held in the handle. It means that the URL was valid as far as libcurl is concerned.
The function call's fourth argument is a bitmask. Set none, one or more bits in it to alter the parser's behavior:
CURLU_NON_SUPPORT_SCHEME
Makes curl_url_set() accept a non-supported scheme. If not set, the only acceptable schemes are for the protocols libcurl knows and has built-in support for.
CURLU_URLENCODE
Makes the function URL encode the path part if any bytes in it would benefit from that: like spaces or "control characters".
CURLU_DEFAULT_SCHEME
If the passed in string does not use a scheme, assume that the default one was intended. The default scheme is HTTPS. If this is not set, a URL without a scheme part is not accepted as valid. Overrides the CURLU_GUESS_SCHEME option if both are set.
CURLU_GUESS_SCHEME
Makes libcurl allow the URL to be set without a scheme; it instead "guesses" which scheme was intended based on the hostname. If the outermost sub-domain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP then that scheme is used, otherwise it picks HTTP. Conflicts with the CURLU_DEFAULT_SCHEME option which takes precedence if both are set.
CURLU_NO_AUTHORITY
Skips authority checks. The RFC allows individual schemes to omit the host part (normally the only mandatory part of the authority), but libcurl cannot know whether this is permitted for custom schemes. Specifying the flag permits empty authority sections, similar to how the file scheme is handled. Really only usable in combination with CURLU_NON_SUPPORT_SCHEME.
CURLU_PATH_AS_IS
Makes libcurl skip the normalization of the path. That is the procedure where curl otherwise removes sequences of dot-slash and dot-dot etc. The corresponding option for transfers is called CURLOPT_PATH_AS_IS.
CURLU_ALLOW_SPACE
Makes the URL parser allow space (ASCII 32) where possible. The URL syntax normally does not allow spaces anywhere; they should instead be encoded as %20 or +. When spaces are allowed, they are still not allowed in the scheme. When space is used and allowed in a URL, it is stored as-is unless CURLU_URLENCODE is also set, which then makes libcurl URL-encode the space before it is stored. This affects how the URL is constructed when curl_url_get() is subsequently used to extract the full URL or individual parts.