chapter2.html

<!doctype html>
<html>

<head>
  <title>SQL basics</title>
  <link rel=stylesheet href="http://fonts.googleapis.com/css?family=Averia+Serif+Libre:300,400">
  <link rel=stylesheet href=style.css>
  <meta http-equiv=Content-Type content="text/html; charset=utf-8">
</head>
<body>

<h1>SQL basics</h1>

<h2>Simple queries</h2>
<p>In a SQL database, data is stored in a set of tables. A table contains a collection of entries (="lines"), information of the same type such as personal data or phone contract information.</p>

<p align=center><img src="table1.png" alt="person table" width=650></p>
<p align=center><img src="table2.png" alt="phone_contract table" width=650></p>

<p>We want to see what is in the table person:</p>
<pre>select * from person;</pre>
<p>This is too much information as we are only interested in the name:</p>
<pre>select name from person;</pre>
<p>In general:</p>
<pre>select col1, col2, ... from tablename;</pre>
<p>Note that <code>select *</code> selects <em>all</em> columns of the respective table. We can get the number of lines of results by using the count command. To get the number of entries in the person table we use</p>
<pre>select count(*) from person;</pre>
<p>Results might contain duplicates (they form a multi-set). We can get only the distinct results by</p>
<pre>select distinct col1,col2, ... from tablename;</pre>
<p>Note that you should always enter only a <em>single</em> SQL-statement at once. Now you have learned enough start writing SQL queries and to answer the first question
<p class=question>1.) Return a list of residences without duplicates.</p>
<p class=question>2.) How many different cities of residence do the people in the persons table have?</p>

<h2>Constraints</h2>
<p>Quite often we are not interested in all of the entries of the database, especially because it might contain a huge number of lines. To get only the lines fulfilling some constraints, the <code>where</code> clause is used, e.g:</p>
<pre>select name, residence from person where age > 30;</pre>
<p>For the different type of data, different constrains make sense. For all kind of data, we can check for equal <code>=</code> and not equal <code>!=</code>. For numbers, one can also compare for <em>larger</em> <code>></code>, <em>larger or equal</em> <code>>=</code>, <em>smaller</em> <code><</code>, <em>smaller or equal</em> <code><=</code>.</p>

<p>For text, we use <code>like</code>, to find entries where the text value in a column contains a string, where you can use the <code>%</code> as a wild-card for unknown parts of a word. E.g. if you search for <code>'%an%'</code>, the following words would fit: B<strong>an</strong>ana, f<strong>an</strong>tasy. Note that in SQLite, the <code>like</code> operator is <em>case insensitive</em> by default.

<p class=question>3.) Get all people whose names contains "oh".</p>
<p>In SQLite the <code>></code>,<code><</code>,<code>=</code> operators for dates, do an <em>lexicographical</em> comparison, not a <em>numeric</em> one. This means that if there is an inconsistent representation of dates, the comparison might fail even though the two values represent the same dates. E.g. if once the format <code>"2014-10-23"</code> and once the format <code>"2014/10/16"</code> is used. Note that for dates double quotes <code>"</code> are used instead of the single quotes <code>'</code>. used for text. (No quotes used for numbers)</p>
There is a table named <code>flight</code> containg flight data. The column with the dates is called <code>date</code>.
<p class=question>4.) Get all flights before October 20.</p>

<p>Furthermore, the operator <code>in</code> checks if the field on the left side of the operator is a member of the set right to the operator. The two following statements are equivalent</p>
<pre>
name in ('Philipp', 'Carlos')
</pre><pre>
name = 'Philipp' or name = 'Carlos'
</pre>
<p class=question>5.)Formulate two suitable queries to demonstrate that.</p>

<p>Constraints can be combined by the logical operator <code>and</code> and the operator <code>or</code>.</p>
<p class=question>6.) Find all people whose first name is Philipp and who are older than 50 years.</p>
<p>There is another table which is called <code>phone_contract</code>. The column containing the status (active, non-active) is called <code>status</code>.</p>
<p class=question>7.) Find out, how many contracts are there. With a second query find out how many of them are active.</p>

<h2>Results from more than one table</h2>
<p>Usually, it is necessary to get information which is spread over more than one table. For example, you may want to know the names of the people who have an active phone contract.</p>
<p>When you know what you are looking for you are already able to formulate a concrete condition in the where clause. With SQL you can combine information from various tables, e.g.</p>
<pre>select person.name, phone_contract.phone_number
from person, phone_contract;
</pre>
<p>Note that we could also write <code>phone_number</code> instead of <code>phone_contract.phone_number</code> as the table person does not contain a column called <code>phone_number</code> and, therefore, no ambiguity could arise.</p>
<p class=question>8.) What does this query return?</p>
<p>To output the matching names and phone numbers, we can use a <code>where</code> clause again:</p>
<pre>select p.name, c.phone_number from person p, phone_contract c
where c.name = p.name;</pre>
<p>As we did not want to write the complete table name, we have defined and abbreviation in the <code>from</code> clause (<code>person p</code>).</p>
<p class=question>9.) Find all names and ages of persons who have an active phone contract.</p>
<p class=question>10.) Find all names and phone numbers of people who are older than 30.</p>

<h2>Sub-queries</h2>
<p>The methods described above allow to get information from an arbitrary number of tables. But if you want to formulate a condition which depends on values of certain data entries, you need sub-queries. </p>

<p class=question>11.)What does the following query do?</p>
<pre>
select person.name, person.age, residence, phone_number from person, phone_contract
where person.name = phone_contract.name and
age in (select age from person where residence = 'Paris');
</pre>
<p>The part in the last line in parentheses is called a <em>sub-query</em>. Such sub-queries can be a little bit confusing at first sight so you might want to pause for a moment.</p>
<p>Now let's look at the statement above a little bit closer. First, only perform the sub-query
</p>
<pre>
select age from person where residence = "Paris";
</pre>
<p>
and look at the results. It returns a the <em>list</em> of all ages of persons with residence in Paris. As a next step, try the query
</p>
</pre>
select person.name, person.age, residence from person 
where age in (21,36);
</pre>
<p>
Now you can replace the <code>where</code> clause whith the sub-query above without altering the result. Try this! The last step to get to the complete statement is to add the matching <code>phone_mumber</code> from the table <code>phone_contract</code>.
</p>
<p> 
This technique of decomposing a complex SQL-statement into its simpler parts and than reassembling them is offen extremely helpful when dealing with new or unfamiliar constructs.
</p>


<p>If you have no more question concerning basics SQL queries, you can procede to the next chapter and solve <a href="chapter3.html">the theft of the Mona Lisa</a>.</p>

<p><a href="chapter2_solutions.html">Solutions</a></p>

<p><a href="index.html">Back to start</a></p>

</body>


</html>