We now have the Python tools we need to create our first web applications. The process of creating web applications is called web application development, the raison d’être for this book.
The World Wide Web – or more simply the web – is a global system of linked documents accessed through the Internet, which is itself a global computer network.
The web uses a client-server model: web pages are retrieved from web servers and then viewed in web browsers, which are software applications running on the client computer.
Web pages are located using an addressing system called a uniform resource locator or URL. A URL looks like this:
http://openbookproject.net/books/bpp4awd
This URL consists of three parts. The beginning of the URL, http://, is the URI scheme or protocol. This one uses the Hypertext Transfer Protocol or HTTP. Other schemes you are likely to encounter include HTTPS, FTP, and SSH. The middle part, openbookproject.net, specifies the server where this resource resides. The last part, /books/bpp4awd, identifies the specific resource on this server.
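As a rough illustration, Python's standard urllib.parse module can split a URL into the same three parts. This is only a sketch; the variable name parts is chosen here for illustration.

```python
from urllib.parse import urlsplit

# Split the URL discussed above into its components.
parts = urlsplit("http://openbookproject.net/books/bpp4awd")

print(parts.scheme)  # the scheme, reported without the "://" separator
print(parts.netloc)  # the server that holds the resource
print(parts.path)    # the specific resource on that server
```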
The process of asking for a web page is called an HTTP request, and the exchange of messages between the client and the server is called the request-response cycle.
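One way to watch a single request-response cycle without leaving your own machine is sketched below, using only the standard library. The names HelloHandler and fetch_once are hypothetical, and the server listens on the loopback address with a port chosen by the operating system; a real web server would look quite different.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same small HTML page."""

    def do_GET(self):
        body = b"<html><body><p>Hello, web!</p></body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet instead of logging each request


def fetch_once():
    """Start a throwaway server, make one request, and return the response."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: let the OS pick
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # The client side: send the request, then read the response.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        result = (response.status, response.reason, response.read().decode("utf-8"))
        connection.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, reason, body = fetch_once()
print(status, reason)
print(body)
```

Here the client and the server run in the same program, but the messages still travel over a network connection, just as they would between a browser and a remote web server.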
HTML stands for HyperText Markup Language. An HTML document is all plain text. Because it must be able to express the structure of this text (which text is a heading, which text is a paragraph, and so on), a few characters have a special meaning, somewhat like backslashes in Python strings. The “less than” and “greater than” characters are used to create HTML tags. Most tags occur in pairs, with a start tag and an end tag, with text data between them. The start and end tag together with the enclosed text form an HTML element.
Elements provide extra information about the data in the document. They can stand on their own, for example to mark the place where a picture should appear in the page, or they can contain text and other elements, for example when they mark the start and end of a paragraph.
Some elements are compulsory: a whole HTML document must always be contained in an html element. Here is an example of an HTML document:
A rendered version of this web page can be seen here.
Elements that contain text or other tags are first opened with <tagname>, and afterwards finished with </tagname>. The html element always contains two children: head and body. The first contains information about the document, the second contains the actual document.
Most tag names are cryptic abbreviations. h1 stands for “heading 1”, the top level heading. There are also h2 to h6 for successive subheadings. p means “paragraph”, and img stands for “image”. The img element does not contain any text or other tags, but it does have some extra information, src="timpeters.jpg" and alt="Tim Peters", which are called attributes. In this case, they contain information about the image file that should be shown here.
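To see tags and attributes from Python's point of view, the sketch below feeds a small document to the standard library's html.parser module. The SAMPLE document and the TagCollector class are made up for this illustration; only the img attributes are taken from the text above.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record each start tag and any attributes that come with it."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)
        if attrs:
            # attrs arrives as a list of (name, value) pairs
            self.attributes[tag] = dict(attrs)


# A hypothetical document, not the one used elsewhere in this chapter.
SAMPLE = """<html>
<head><title>Sample page</title></head>
<body>
<h1>A heading</h1>
<p>A paragraph.</p>
<img src="timpeters.jpg" alt="Tim Peters">
</body>
</html>"""

collector = TagCollector()
collector.feed(SAMPLE)
print(collector.start_tags)
print(collector.attributes["img"])
```

Notice that img shows up as a start tag with attributes but never needs an end tag, since it contains no text or other elements.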
Because < and > have a special meaning in HTML documents, they cannot be written directly in the text of the document. If you want to display 5 < 10 in an HTML document, you have to write “5 &lt; 10”, where &lt; represents the less-than sign (<). &gt; is used for >, and because these codes also give the ampersand character a special meaning, a plain & is written as &amp;.
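When a Python program produces HTML, these replacements do not have to be made by hand. The sketch below uses the standard library's html module; the sample strings and variable names are invented for illustration.

```python
import html

# Replace the special characters with their HTML codes.
comparison = html.escape("5 < 10")
names = html.escape("Tom & Jerry")
print(comparison)
print(names)

# Go the other way: turn the codes back into plain characters.
restored = html.unescape("5 &lt; 10 &amp; 10 &gt; 5")
print(restored)
```

By default html.escape also replaces quote characters, which matters when the text ends up inside an attribute value.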
These are only the bare basics of HTML, but they should be enough to get you through this chapter. As an aspiring web developer, you will want to learn more about HTML as soon as you can.
CSS stands for Cascading Style Sheets. CSS is a styling language designed to describe the look and formatting (the presentation semantics) of web pages. Along with HTML and JavaScript, it is one of the three languages that web browsers can process natively.
CSS syntax consists of a collection of styles or rules. Each rule is composed of a selector and a declaration block. The selector determines (selects) which HTML elements the style will apply to. The declaration block is in turn composed of a sequence of property-value pairs. The property is separated from the value by a colon (:), and property-value pairs are separated from each other by a semi-colon (;).
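The anatomy of a rule can be made concrete with a few lines of Python that assemble one from its pieces. This is only a sketch: format_rule is a hypothetical helper, and the selector and property values are chosen purely for illustration.

```python
def format_rule(selector, declarations):
    """Assemble one CSS rule from a selector and a dict of property-value pairs."""
    # Each pair becomes "property: value;" on its own indented line.
    lines = [f"    {prop}: {value};" for prop, value in declarations.items()]
    # The declaration block is wrapped in braces after the selector.
    return selector + " {\n" + "\n".join(lines) + "\n}"


rule = format_rule("h1", {"color": "navy", "text-align": "center"})
print(rule)
```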
Here is an example of a style sheet:
Styles can be applied internally to an HTML document using style elements (between <style type="text/css"></style> tags) in the document head. Here is the preceding quote web page with the style included:
Here is this web page rendered by your browser.
The class and id attributes can be added to HTML elements for the purpose of styling them with CSS. In this example, the second paragraph element in the blockquote has been given a class named “author”. This example also makes use of a number of Web colors.
Learning more about HTML and CSS
A working knowledge of HTML and CSS is a prerequisite for creating web applications. Presentation of the details is outside the scope of this book. A quick but sufficient introduction to both of these topics can be found in Getting Down with HTML and Getting Down with CSS.
In some cases, it is also practical to have a program that runs after the page has been sent, when the user is looking at it. This is called client-side scripting, because the program runs on the client computer. Client-side web scripting is what JavaScript was invented for.
The scripts are enclosed in script elements (between <script></script> tags), usually in the document head. In addition to knowing how to render HTML styled with CSS, almost all current web browsers have built-in JavaScript engines that enable them to interpret JavaScript source included in script elements.
It is also possible to include JavaScript source in a separate file. Browsers load JavaScript files when they find a start <script> tag in a web page with a src attribute whose value is the URL of the file containing the JavaScript code. The extension .js is usually used for files containing JavaScript code. These files can be located on the same machine as the web page or anywhere on the web. The browser will fetch all these extra files from their servers, so it can add them to the document.
Like Python, JavaScript is a programming language. As an aspiring web developer, you will need to learn JavaScript in addition to HTML and CSS.
Although a URL can simply point at a file, it is also possible for a web server to do more than just look up a file and send it to the client. It can process the file in some way first, or even create it dynamically upon receiving the request.
Programs that transform or generate documents on a server are what web applications are made of.
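A minimal sketch of such a program follows. It uses the WSGI calling convention supported by the standard library's wsgiref module, which is just one of several ways to connect Python code to a web server. The names application and call_app, and the page content, are assumptions made for this illustration. The helper call_app invokes the program directly, so no network is involved.

```python
import html
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Create an HTML page on the fly, based on the path that was requested."""
    path = environ.get("PATH_INFO", "/")
    page = (
        "<html><head><title>Dynamic page</title></head>"
        f"<body><p>You asked for {html.escape(path)}</p></body></html>"
    )
    body = page.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Call the application the way a server would, and collect its output."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a plausible fake request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")


status, page = call_app("/books/<draft>")
print(status)
print(page)
```

No file named /books/<draft> exists anywhere; the page is generated at the moment the request arrives.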
To get information (data) from the client to the server, HTML uses forms.
A basic HTTP request is a simple request for a file. When this file is not really a passive file, but a server-side program, it can become useful to include information other than a filename in the request. For this purpose, HTTP requests are allowed to contain additional ‘parameters’. Here is an example:
After the filename (/search), the URL continues with a question mark, after which the parameters follow. This request has one parameter, called q (for “query”, presumably), whose value is aztec empire. The %20 part corresponds to a space. There are a number of characters that cannot occur in these values, such as spaces, ampersands, or question marks. These are “escaped” by replacing them with a % followed by their numerical value written in hexadecimal, which serves the same purpose as the backslashes used in strings and regular expressions, but is even more unreadable.
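Python can do this escaping for you. The sketch below uses quote and unquote from the standard library's urllib.parse module; the variable names are chosen here for illustration.

```python
from urllib.parse import quote, unquote

# Escape the characters that may not appear in a parameter value.
encoded = quote("aztec empire")
print(encoded)

# Reverse the escaping to get the original text back.
decoded = unquote(encoded)
print(decoded)

# Ampersands and question marks are escaped in the same way.
ampersand = quote("&")
question_mark = quote("?")
print(ampersand, question_mark)

# The two characters after the % are the character's number in hexadecimal.
space_number = ord(" ")
print(space_number, hex(space_number))
```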
Note
The value a character gets is decided by the ASCII standard, which assigns the numbers 0 to 127 to a set of letters and symbols used by the Latin alphabet. This standard is a precursor of the Unicode standard.
When a request contains more than one parameter, they are separated by ampersands, as in…:
A form, basically, is a way to make it easy for browser-users to create such parameterised URLs. It contains a number of fields, such as input boxes for text, checkboxes that can be “checked” and “unchecked”, or thingies that allow you to choose from a given set of values. It also usually contains a “submit” button and, invisible to the user, an “action” URL to which it should be sent. When the submit button is clicked, or enter is pressed, the information that was entered in the fields is added to this action URL as parameters, and the browser will request this URL.
Here is the HTML for a simple form:
The name of the form can be used to access it with JavaScript, as we shall see in a moment. The names of the fields determine the names of the HTTP parameters that are used to store their values. Sending this form might produce a URL like this:
There are quite a few other tags and properties that can be used in forms, but in this book we will stick with simple ones, so that we can concentrate on JavaScript.
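What the browser does when a form is submitted can be imitated in Python. In the sketch below, the action URL and the lang field are hypothetical, made up only to show how field names and values end up as parameters separated by ampersands, and how a server-side program might take them apart again.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

action = "http://example.com/search"          # hypothetical action URL
fields = {"q": "aztec empire", "lang": "en"}  # field names mapped to entered values

# Join the fields as name=value pairs separated by ampersands.
# quote_via=quote escapes spaces as %20 (the default would use + instead).
query = urlencode(fields, quote_via=quote)
url = action + "?" + query
print(url)

# On the receiving side, the parameters can be unpacked again.
received = parse_qs(urlsplit(url).query)
print(received)
```

Each value comes back as a list, because the same parameter name may appear more than once in a URL.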
get and post
The method="get" attribute of the example form shown above indicates that this form should encode the values it is given as URL parameters, as shown before. There is an alternative method for sending parameters, which is called post. An HTTP request using the post method contains, in addition to a URL, a block of data. A form using the post method puts the values of its parameters in this data block instead of in the URL.
When sending big chunks of data, the get method will result in URLs that are a mile wide, so post is usually more convenient. But the difference between the two methods is not just a question of convenience. Traditionally, get requests are used for requests that just ask the server for some document, while post requests are used to take an action that changes something on the server. For example, getting a list of recent messages on an Internet forum would be a get request, while adding a new message would be a post request. There is a good reason why most pages follow this distinction: programs that automatically explore the web, such as those used by search engines, will generally only make get requests. If changes to a site can be made by get requests, these well-meaning ‘crawlers’ could do all kinds of damage.
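The difference in where the parameters travel can be seen by building both kinds of request in Python without sending either of them. The forum URL and the field names below are hypothetical; Request comes from the standard library's urllib.request module.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {"name": "Ada", "message": "Hello there"}  # hypothetical form values
encoded = urlencode(fields)

# get: the parameters become part of the URL itself.
get_request = Request("http://example.com/forum?" + encoded)

# post: the URL stays short and the parameters go in a separate block of data.
post_request = Request("http://example.com/forum", data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Nothing is transmitted here; the two objects only describe the requests that would be made.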
client
A program that accesses the services offered by a server.
network
A collection of computers and hardware components interconnected by communications channels enabling the sharing of information and resources.
protocol
A system of digital message formats and rules for exchanging messages in and between computing systems.
server
A program that offers services to client programs.
web application
A computer program that uses a web browser as a client.
web browser
A computer application for retrieving, presenting and traversing web pages.
web page
A document that is viewable in a web browser. Web pages are usually written in HTML and styled with CSS.
web server
A software application for serving web pages.
World Wide Web
The global system of linked documents accessed through the Internet.