Add CGI docs

This commit is contained in:
Denis Vlasenko 2007-02-11 14:52:07 +00:00
parent ad67a3925c
commit 136f42f503
6 changed files with 3057 additions and 0 deletions

46
docs/cgi/cl.html Normal file

@ -0,0 +1,46 @@
<html><head><title>CGI Command line options</title></head><body><h1><img alt="" src="cl_files/CGIlogo.gif"> CGI Command line options</h1>
<hr> <p>
</p><h2>Specification</h2>
The command line is only used in the case of an ISINDEX query. It is
not used in the case of an HTML form or any as yet undefined query
type. The server should search the query information (the <code>QUERY_STRING</code> environment variable) for a non-encoded
= character to determine whether the command line is to be used. If it
finds one, the command line is not to be used. This trusts the clients
to encode the = sign in ISINDEX queries, a practice which was
considered safe at the time of the design of this specification. <p>
For example, use the <a href="http://hoohoo.ncsa.uiuc.edu/cgi-bin/finger">finger script</a> and the ISINDEX interface to look up "httpd". The script calls itself with <code>/cgi-bin/finger?httpd</code>, executes "finger httpd" on the command line, and returns the results to you.
</p><p>
If the server does find a "=" in the <code>QUERY_STRING</code>,
then the command line will not be used, and no decoding will be
performed. The query then remains intact for processing by an
appropriate FORM submission decoder.
Again, as an example, use <a href="http://hoohoo.ncsa.uiuc.edu/cgi-bin/finger?httpd=name">this hyperlink</a> to submit <code>"httpd=name"</code> to the finger script. Because this <code>QUERY_STRING</code>
contains an unencoded "=", nothing is decoded; the script does not
recognize the submission as a valid query and simply returns the default
finger form.
</p><p>
If the server finds that it cannot send the string due to internal
limitations (such as exec() or /bin/sh command line restrictions), the
server should include NO command line information and provide the
non-decoded query information in the environment
variable <a href="http://hoohoo.ncsa.uiuc.edu/cgi/env.html#query"><code>QUERY_STRING</code></a>. </p><p>
</p><hr>
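<p>The decision described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular server: the function name <code>command_line_args</code> is hypothetical, and splitting the query on "+" before URL-decoding each word is an assumption about how ISINDEX search words are turned into arguments.</p>

```python
from urllib.parse import unquote


def command_line_args(query_string):
    """Return the words a server might pass as argv for an ISINDEX query.

    An empty list means the command line is not used and the script
    should read QUERY_STRING itself.
    """
    # A literal "=" is by definition unencoded (an encoded one is "%3D"),
    # so its presence marks a form-style query: no command line, no decoding.
    if not query_string or "=" in query_string:
        return []
    # Assumption: search words are separated by "+" and each is URL-decoded.
    return [unquote(word) for word in query_string.split("+")]
```

<p>With the finger examples above, the query <code>httpd</code> would yield one argument, while <code>httpd=name</code> would yield none and leave <code>QUERY_STRING</code> untouched for a form decoder.</p>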
<h2>Examples</h2>
Examples of the command line usage are much better <a href="http://hoohoo.ncsa.uiuc.edu/cgi/examples.html">demonstrated</a> than explained. For these
examples, pay close attention to the script output which says what
argc and argv are. <p>
</p><hr>
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"><img alt="[Back]" src="cl_files/back.gif">Return to the
interface specification</a> <p>
CGI - Common Gateway Interface
</p><address><a href="http://hoohoo.ncsa.uiuc.edu/cgi/mailtocgi.html">cgi@ncsa.uiuc.edu</a></address>
</body></html>

149
docs/cgi/env.html Normal file

@ -0,0 +1,149 @@
<html><head><title>CGI Environment Variables</title></head><body><h1><img alt="" src="env_files/CGIlogo.gif"> CGI Environment Variables</h1>
<hr>
<p>
In order to pass data about the information request from the server to
the script, the server uses command line arguments as well as
environment variables. These environment variables are set when the
server executes the gateway program. </p><p>
</p><hr>
<h2>Specification</h2>
<p>
The following environment variables are not request-specific and are
set for all requests: </p><p>
</p><ul>
<li> <code>SERVER_SOFTWARE</code> <p>
The name and version of the information server software answering
the request (and running the gateway). Format: name/version </p><p>
</p></li><li> <code>SERVER_NAME</code> <p>
The server's hostname, DNS alias, or IP address as it would appear
in self-referencing URLs. </p><p>
</p></li><li> <code>GATEWAY_INTERFACE</code> <p>
The revision of the CGI specification to which this server
complies. Format: CGI/revision</p><p>
</p></li></ul>
<hr>
The following environment variables are specific to the request being
fulfilled by the gateway program: <p>
</p><ul>
<li> <a name="protocol"><code>SERVER_PROTOCOL</code></a> <p>
The name and revision of the information protocol this request came
in with. Format: protocol/revision </p><p>
</p></li><li> <code>SERVER_PORT</code> <p>
The port number to which the request was sent. </p><p>
</p></li><li> <code>REQUEST_METHOD</code> <p>
The method with which the request was made. For HTTP, this is
"GET", "HEAD", "POST", etc. </p><p>
</p></li><li> <code>PATH_INFO</code> <p>
The extra path information, as given by the client. In other
words, scripts can be accessed by their virtual pathname, followed
by extra information at the end of this path. The extra
information is sent as PATH_INFO. This information should be
decoded by the server if it comes from a URL before it is passed
to the CGI script.</p><p>
</p></li><li> <code>PATH_TRANSLATED</code> <p>
The server provides a translated version of PATH_INFO, which takes
the path and does any virtual-to-physical mapping to it. </p><p>
</p></li><li> <code>SCRIPT_NAME</code> <p>
A virtual path to the script being executed, used for
self-referencing URLs. </p><p>
</p></li><li> <a name="query"><code>QUERY_STRING</code></a> <p>
The information which follows the ? in the <a href="http://www.ncsa.uiuc.edu/demoweb/url-primer.html">URL</a>
which referenced this script. This is the query information. It
should not be decoded in any fashion. This variable should always
be set when there is query information, regardless of <a href="http://hoohoo.ncsa.uiuc.edu/cgi/cl.html">command line decoding</a>. </p><p>
</p></li><li> <code>REMOTE_HOST</code> <p>
The hostname of the client making the request. If the server does not have this
information, it should set REMOTE_ADDR and leave this unset.</p><p>
</p></li><li> <code>REMOTE_ADDR</code> <p>
The IP address of the remote host making the request. </p><p>
</p></li><li> <code>AUTH_TYPE</code> <p>
If the server supports user authentication, and the script is
protected, this is the protocol-specific authentication method used
to validate the user. </p><p>
</p></li><li> <code>REMOTE_USER</code> <p>
If the server supports user authentication, and the script is
protected, this is the username the user has authenticated as. </p><p>
</p></li><li> <code>REMOTE_IDENT</code> <p>
If the HTTP server supports RFC 931 identification, then this
variable will be set to the remote user name retrieved from the
server. Usage of this variable should be limited to logging only.
</p><p>
</p></li><li> <a name="ct"><code>CONTENT_TYPE</code></a> <p>
For queries which have attached information, such as HTTP POST and
PUT, this is the content type of the data. </p><p>
</p></li><li> <a name="cl"><code>CONTENT_LENGTH</code></a> <p>
The length of that content, as given by the client. </p><p>
</p></li></ul>
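<p>As a rough illustration of how a script might consume these variables, the following Python sketch gathers a few of them into a dictionary. The function name <code>describe_request</code> and the dictionary keys are hypothetical; only the environment variable names come from the list above.</p>

```python
import os


def describe_request(environ=os.environ):
    """Collect a few request-specific CGI variables into a dictionary."""
    return {
        "method": environ.get("REQUEST_METHOD", ""),
        "script": environ.get("SCRIPT_NAME", ""),
        "extra_path": environ.get("PATH_INFO", ""),
        # QUERY_STRING arrives undecoded; decoding is the script's job.
        "query": environ.get("QUERY_STRING", ""),
        # REMOTE_HOST may be unset, in which case REMOTE_ADDR is the fallback.
        "client": environ.get("REMOTE_HOST") or environ.get("REMOTE_ADDR", ""),
    }
```

<p>Passing a plain dictionary instead of <code>os.environ</code> makes the function easy to try outside a server.</p>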
<a name="headers"><hr></a>
In addition to these, the header lines received from the client, if
any, are placed into the environment with the prefix HTTP_ followed by
the header name. Any - characters in the header name are changed to _
characters. The server may exclude any headers which it has already
processed, such as Authorization, Content-type, and Content-length. The
server may also exclude any or all of these headers if including them
would exceed system environment
limits. <p>
An example of this is the HTTP_ACCEPT variable which was defined in
CGI/1.0. Another example is the header User-Agent.</p><p>
</p><ul>
<li> <code>HTTP_ACCEPT</code> <p>
The MIME types which the client will accept, as given by HTTP
headers. Other protocols may need to get this information from
elsewhere. Each item in this list should be separated by commas as
per the HTTP spec. </p><p>
Format: type/subtype, type/subtype </p><p>
</p></li><li> <code>HTTP_USER_AGENT</code><p>
The browser the client is using to send the request. General
format: <code>software/version library/version</code>.</p><p>
</p></li></ul>
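<p>The naming rule for client headers can be sketched as a one-line Python function. The name <code>header_to_env_name</code> is hypothetical, and the upper-casing step is an inference from the <code>HTTP_ACCEPT</code> and <code>HTTP_USER_AGENT</code> examples rather than something the text states outright.</p>

```python
def header_to_env_name(header_name):
    """Map an HTTP header name to its CGI environment variable name."""
    # Prefix with HTTP_ and turn every "-" into "_".
    # Upper-casing is inferred from HTTP_ACCEPT and HTTP_USER_AGENT.
    return "HTTP_" + header_name.upper().replace("-", "_")
```

<p>Under these assumptions, <code>User-Agent</code> maps to <code>HTTP_USER_AGENT</code>.</p>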
<hr>
<h2>Examples</h2>
Examples of the setting of environment variables are really much better
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/examples.html">demonstrated</a> than explained. <p>
</p><hr>
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"><img alt="[Back]" src="env_files/back.gif">Return to the
interface specification</a> <p>
CGI - Common Gateway Interface
</p><address><a href="http://hoohoo.ncsa.uiuc.edu/cgi/mailtocgi.html">cgi@ncsa.uiuc.edu</a></address>
</body></html>

33
docs/cgi/in.html Normal file

@ -0,0 +1,33 @@
<html><head><title>CGI Script input</title></head><body><h1><img alt="" src="in_files/CGIlogo.gif"> CGI Script Input</h1>
<hr>
<h2>Specification</h2>
For requests which have information attached after the header, such as
HTTP POST or PUT, the information will be sent to the script on stdin.
<p>
The server will send <a href="http://hoohoo.ncsa.uiuc.edu/cgi/env.html#cl">CONTENT_LENGTH</a> bytes on
this file descriptor. Remember that it will give the <a href="http://hoohoo.ncsa.uiuc.edu/cgi/env.html#ct">CONTENT_TYPE</a> of the data as well. The server is
in no way obligated to send end-of-file after the script reads
<code>CONTENT_LENGTH</code> bytes. </p><p>
</p><hr>
<h2>Example</h2>
Let's take a form with METHOD="POST" as an example. Let's say the form
results are 7 bytes encoded, and look like <code>a=b&amp;b=c</code>.
<p>
In this case, the server will set CONTENT_LENGTH to 7 and CONTENT_TYPE
to application/x-www-form-urlencoded. The first byte on the script's
standard input will be "a", followed by the rest of the encoded string.</p><p>
</p><hr>
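<p>A script reading the POST data from the example above might look like the following Python sketch. The function name <code>read_request_body</code> and its parameters are hypothetical; the point it illustrates is that the script reads exactly <code>CONTENT_LENGTH</code> bytes rather than waiting for end-of-file.</p>

```python
import os
import sys


def read_request_body(environ=os.environ, stream=None):
    """Read exactly CONTENT_LENGTH bytes of request data from stdin."""
    if stream is None:
        stream = sys.stdin.buffer
    # A missing or empty CONTENT_LENGTH is treated as "no attached data".
    length = int(environ.get("CONTENT_LENGTH") or 0)
    # Read only the announced number of bytes; the server need not send
    # end-of-file, so reading until EOF could block forever.
    return stream.read(length)
```

<p>For the 7-byte form in the example, this would return the encoded string and leave any further bytes on the stream unread.</p>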
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"><img alt="[Back]" src="in_files/back.gif">Return to the
interface specification</a> <p>
CGI - Common Gateway Interface
</p><address><a href="http://hoohoo.ncsa.uiuc.edu/cgi/mailtocgi.html">cgi@ncsa.uiuc.edu</a></address>
</body></html>

29
docs/cgi/interface.html Normal file

@ -0,0 +1,29 @@
<html><head><title>The Common Gateway Interface Specification
[http://hoohoo.ncsa.uiuc.edu/cgi/interface.html]
</title></head><body><h1><img alt="" src="interface_files/CGIlogo.gif"> The CGI Specification</h1>
<hr>
This is the specification for CGI version 1.1, or CGI/1.1. Further
revisions of this protocol are guaranteed to be backward compatible.
<p>
The server and the CGI script communicate in four major ways. Each of
the following is a hotlink to graphic detail.</p><p>
</p><ul>
<li> <a href="env.html">Environment variables</a>
</li><li> <a href="cl.html">The command line</a>
</li><li> <a href="in.html">Standard input</a>
</li><li> <a href="out.html">Standard output</a>
</li></ul>
<hr>
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/overview.html"><img alt="[Back]" src="interface_files/back.gif">Return to the overview</a> <p>
CGI - Common Gateway Interface
</p><address><a href="http://hoohoo.ncsa.uiuc.edu/cgi/mailtocgi.html">cgi@ncsa.uiuc.edu</a></address>
</body></html>

126
docs/cgi/out.html Normal file

@ -0,0 +1,126 @@
<html><head><title>CGI Script output</title></head><body><h1><img alt="" src="out_files/CGIlogo.gif"> CGI Script Output</h1>
<hr>
<h2>Script output</h2>
The script sends its output to stdout. This output can either be a
document generated by the script, or instructions to the server for
retrieving the desired output. <p>
</p><hr>
<h2>Script naming conventions</h2>
Normally, scripts produce output which is interpreted and sent back to
the client. An advantage of this is that the scripts do not need to
send a full HTTP/1.0 header for every request. <p>
<a name="nph">
Some scripts may want to avoid the extra overhead of the server
parsing their output, and talk directly to the client. In order to
distinguish these scripts from the other scripts, CGI requires that
the script name begins with nph- if a script does not want the server
to parse its header. In this case, it is the script's responsibility
to return a valid HTTP/1.0 (or HTTP/0.9) response to the client. </a></p><p>
</p><hr>
<h2><a name="nph">Parsed headers</a></h2>
<a name="nph">The output of scripts begins with a small header. This header consists
of text lines, in the same format as an </a><a href="http://www.w3.org/hypertext/WWW/Protocols/HTTP/Object_Headers.html">
HTTP header</a>, terminated by a blank line (a line with only a
linefeed or CR/LF). <p>
Any headers which are not server directives are sent directly back to
the client. Currently, this specification defines three server
directives:</p><p>
</p><ul>
<li> <code>Content-type</code> <p>
This is the MIME type of the document you are returning. </p><p>
</p></li><li> <code>Location</code> <p>
This is used to specify to the server that you are returning a
reference to a document rather than an actual document. </p><p>
If the argument to this is a URL, the server will issue a redirect
to the client. </p><p>
If the argument to this is a virtual path, the server will
retrieve the document specified as if the client had requested
that document originally. Query strings (?) will work here, but
fragment identifiers (#) must be redirected back to the client.</p><p>
</p></li><li> <a name="status"><code>Status</code></a><p>
This is used to give the server an HTTP/1.0 <a href="http://www.w3.org/hypertext/WWW/Protocols/HTTP/HTRESP.html">status
line</a> to send to the client. The format is <code>nnn xxxxx</code>,
where <code>nnn</code> is the 3-digit status code, and
<code>xxxxx</code> is the reason string, such as "Forbidden".</p><p>
</p></li></ul>
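<p>To make the header handling concrete, here is a minimal Python sketch of how a server might separate the three directives from the headers it passes through. The function name <code>split_script_output</code> is hypothetical, and matching directive names case-insensitively is an assumption.</p>

```python
import re

SERVER_DIRECTIVES = ("content-type", "location", "status")


def split_script_output(output):
    """Split parsed-header script output into directives, other headers, body."""
    # The header block ends at the first blank line (linefeed or CR/LF).
    parts = re.split(r"\r?\n\r?\n", output, maxsplit=1)
    head = parts[0]
    body = parts[1] if len(parts) == 2 else ""
    directives, passthrough = {}, []
    for line in head.splitlines():
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        # Assumption: directive names are matched case-insensitively.
        if name.lower() in SERVER_DIRECTIVES:
            directives[name.lower()] = value
        else:
            passthrough.append((name, value))
    return directives, passthrough, body
```

<p>Headers that land in the second return value would be sent to the client unchanged; the directives in the first are the ones the server acts on.</p>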
<hr>
<h2>Examples</h2>
Let's say I have a fromgratz to HTML converter. When my converter is
finished with its work, it will output the following on stdout (note
that the lines beginning and ending with --- are just for illustration
and would not be output): <p>
</p><pre>--- start of output ---
Content-type: text/html

--- end of output ---
</pre>
Note the blank line after Content-type. <p>
Now, let's say I have a script which, in certain instances, wants to
return the document <code>/path/doc.txt</code> from this server just
as if the user had actually requested
<code>http://server:port/path/doc.txt</code> to begin with. In this
case, the script would output: </p><p>
</p><pre>--- start of output ---
Location: /path/doc.txt

--- end of output ---
</pre>
The server would then retrieve the document and send it to the client.
<p>
Let's say that I have a script which wants to reference our gopher
server. In this case, if the script wanted to refer the user to
<code>gopher://gopher.ncsa.uiuc.edu/</code>, it would output: </p><p>
</p><pre>--- start of output ---
Location: gopher://gopher.ncsa.uiuc.edu/

--- end of output ---
</pre>
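<p>The two <code>Location</code> cases shown so far differ only in whether the argument is a full URL or a virtual path. A server's choice between them could be sketched in Python as below. The function name <code>location_action</code> and its return strings are hypothetical, and treating "has a scheme" as the test for a URL is an assumption.</p>

```python
from urllib.parse import urlsplit


def location_action(target):
    """Decide how a server might treat the argument of a Location header."""
    parts = urlsplit(target)
    # A full URL (http:, gopher:, ...) is handed back to the client, and so
    # is anything carrying a "#" part, which only the client can resolve.
    if parts.scheme or parts.fragment:
        return "redirect-client"
    # Anything else is assumed to be a virtual path on this server.
    return "serve-internally"
```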
Finally, I have a script which wants to talk to the client directly.
In this case, if the script is referenced with <a href="http://hoohoo.ncsa.uiuc.edu/cgi/env.html#protocol"><code>SERVER_PROTOCOL</code></a> of HTTP/1.0,
the script would output the following HTTP/1.0 response: <p>
</p><pre>--- start of output ---
HTTP/1.0 200 OK
Server: NCSA/1.0a6
Content-type: text/plain

This is a plaintext document generated on the fly just for you.
--- end of output ---
</pre>
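<p>An nph- script has to assemble the whole response itself. The following Python sketch builds one in the shape of the last example. The function name <code>nph_response</code> is hypothetical, the default <code>Server</code> value is simply copied from the example, and the CR/LF line endings are an assumption about what a client expects.</p>

```python
def nph_response(status, content_type, body, server="NCSA/1.0a6"):
    """Build a complete HTTP/1.0 response for an nph- script."""
    headers = [
        "HTTP/1.0 " + status,  # status line, e.g. "200 OK"
        "Server: " + server,
        "Content-type: " + content_type,
    ]
    # A blank line separates the headers from the document body.
    return "\r\n".join(headers) + "\r\n\r\n" + body
```

<p>The script would write the returned string to stdout, and the server would relay it to the client without parsing it.</p>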
<hr>
<a href="http://hoohoo.ncsa.uiuc.edu/cgi/interface.html"><img alt="[Back]" src="out_files/back.gif">Return to the
interface specification</a> <p>
CGI - Common Gateway Interface
</p><address><a href="http://hoohoo.ncsa.uiuc.edu/cgi/mailtocgi.html">cgi@ncsa.uiuc.edu</a></address>
</body></html>

File diff suppressed because it is too large