Managing Data with SQL

Adding Data with INSERT and COPY

Once you have created your table with the necessary specifications, the next logical step is of course to fill the table with data. There are generally three methods in PostgreSQL with which you can fill a table with data:

Inserting new values

The following is the syntax of the INSERT INTO command, when used to insert new values, which is described in detail in Table 4-4:

  INSERT INTO table_name 
         [ ( column_name [, ... ] ) ]
         VALUES ( value [, ... ] )

Table 4-4. INSERT INTO ... VALUES Syntax

Element

Description

INSERT INTO table_name

The SQL command which initiates an insertion of data into the table called table_name.

[ ( column_name, [, ... ]) ]

An optional grouped expression which describes the targeted columns for the insertion. If omitted, PostgreSQL assumes that the values following the VALUES clause apply to the literal left-to-right arrangement of columns in the table called table_name.

VALUES

The SQL clause which instructs PostgreSQL to expect a grouped expression of values to follow.

( values [, ...] )

The required grouped expression which describes the values to be inserted; there should be one value for each specified column, separated by commas. These values may be expressions themselves (e.g., an operation between two values), or constants. Each value must be of the data type of the column which it is being inserted into. If the optional column-target expression is omitted, PostgreSQL will expect there to be one value for each column in the literal order of the table's structure. If there are fewer values to be inserted than columns, PostgreSQL will attempt to insert a default value if one is specified (or the NULL value, if there is no default) for each omitted value.

To demonstrate, Example 4-15 illustrates the insertion of some new books into Book Town's books table.

Example 4-15. Inserting New Values into the "books" Table

booktown=# INSERT INTO books (id, title, author_id, subject_id)
booktown-#        VALUES (41472, 'Practical PostgreSQL', 1212, 4);
INSERT 3574037 1

This SQL statement specifies to insert an id of 41472, a title of Practical PostgreSQL, an author identifier of 1212, and a subject identifier of 4. The returned INSERT statement indicates that the insertion was successful. The numbers following the INSERT result in Example 4-15 are the OID (object identifier) of the freshly inserted row, and the number of rows that were inserted (in this case, 1).

Notice that the optional column target list is specified identically to the structure of the table. In this case, omitting that grouped expression would have an no effect on the statement, since it defaults to the assumption that you are inserting in that order if unspecified. You can also re-arrange the names of the columns in the grouped target list if you wish to specify the values in a different order following the VALUES clause, as demonstrated in Example 4-16.

Example 4-16. Changing the Order of Target Columns

booktown=# INSERT INTO books (subject_id, author_id, id, title)
booktown-#        VALUES (4, 7805, 41473, 'Programming Python');
INSERT 3574041 1

Inserting values from other tables with SELECT

If you already have values within one table (or across several other tables) that you wish to insert into a separate table, this can also be achieved with the INSERT INTO command. The following syntax describes this technique, where query is any valid SQL query which returns values which will be suitable for insertion into the table called table_name:

  INSERT INTO table_name 
         [ ( column_name [, ... ] ) ]
         query

Similar to the syntax of INSERT INTO presented in the previous section, you may optionally specify which columns you wish to insert into, and in what order to expect them. You do not provide the VALUES keyword with this form of INSERT INTO, however, instead providing a complete SQL SELECT statement in its place.

For example, imagine that Book Town keeps a table called book_queue, which holds books waiting to be approved. When approved, those values need to be moved from the queue, and into the normal books table. This can be achieved with the syntax demonstrated in Example 4-17.

Example 4-17. Inserting Values from Another Table

booktown=# INSERT INTO books (id, title, author_id, subject_id)
booktown-#        SELECT nextval('book_ids'), title, author_id, subject_id
booktown-#               FROM book_queue WHERE approved;
INSERT 0 2 

The preceding example demonstrates the insertion of two rows from the table book_queue into the books table. This is performed by way of a SELECT statement which is passed to the INSERT INTO command. Any valid SELECT statement may be used in this context. In this case, the query selects the result of a function called nextval() from a sequence called book_ids, followed by the title, author_id and subject_id columns from the book_queue table.

Since more than one row is being inserted, the INSERT result indicating success returns 0 in place of the OID that would be returned had a single row been inserted. The second number, as with a normal INSERT INTO, returns the number of rows inserted (in this case, 2).

Creating tables from other tables

If you have an existing database table and you want to copy the complete table (both the column structure, and the row data), you can use the SELECT INTO command:

  SELECT * INTO TABLE 
newTableName
   FROM  
tableToCopyFrom;

This performs an implicit CREATE TABLE, which creates a table with the same column names, value types, and row data as the original table. Once the message SELECT is displayed, the statement was successfully performed and a new table was created. You can then perform the same operations on this new table as in any other table that you create.

After having inserted these values, the parser returns the message, INSERT. This means it was successfully inserted into the columns. If you want to view it, use the select * from book command. The table looks like this:

  isbn   |            title            | author_id | cpright_date
---------+-----------------------------+-----------+--------------
 8741205 | The Cat in the Hat          |       110 | 1972-04-29
 2567841 | Little Women                |       231 | 1980-02-15
  214872 | Franklin in the Dark        |       142 | 1986-09-22
  602071 | Goodnight Moon              |       136 | 1991-04-01
  355184 | Bartholomew and the Oobleck |       110 | 1997-02-01
  676883 | The Velveteen Rabbit        |       220 | 1958-02-06
(6 rows)

NoteIdentifier Values
 

If you inserted the date as month, day, and year (mm-dd-yyyy), then PostgreSQL changes the format to look like the set format in your system, so when you look at your table after having created it, the date can be formatted differently from the way you entered it in. Therefore, you must be careful when entering days with a value equal to and less than 12 as first in the string of dates. The parser may see it as the month, not as day, or vice versa. Of course, this depends upon the date format set on your system.

You must enter the values for title and cpright_date in quotes because they are string and date formated data types. On the other hand, isbn and author_id can be entered without quotes because they are numeric data types.

You may have also noticed that isbn is entered in lowercase, while it1 was originally created as ISBN. That is because when we created it, we didn't specify ISBN in quotes. If you originally used quotes upon creating the isbn column, then you must have the exact case structure as you first created it. See the identifiers and keywords section for more information.

Copying values from external files

You can also use the COPY command to populate rows with values. This command copies from an ASCII file and inserts the values into the table rows. The path name of the file must be available to the backend server machine, since the backend server reads the file directly. COPY is also faster than insert because the data is read (or written) as a single atomic transaction directly to or from the target table. The syntax for COPY is:

  COPY  table FROM 'path of text file' USING DELIMITERS '|';

Retrieving Rows with SELECT

  SELECT [ ALL | DISTINCT [ ON ( 
expression

 [, ...] ) ] ]
     
expression

 [ AS 
output_name

 ] [, ...]
    [ INTO [ TEMPORARY | TEMP ] [ TABLE ] 
new_table

 ]
    [ FROM 
from_item

 [, ...] ]
    [ WHERE 
condition

 ]
    [ GROUP BY 
expression

 [, ...] ]
    [ HAVING 
condition

 [, ...] ]
    [ { UNION | INTERSECT | EXCEPT [ALL] } 
select

 ]
    [ ORDER BY 
expression

 
               [ ASC | DESC | USING 
operator

 ] [, ...] ]
    [ FOR UPDATE [ OF 
tablename

 [, ...] ] ]
    LIMIT { 
count

 | ALL } [ { OFFSET | , } 
start

 ]

You can the following command to select all (*) columns from the book table:

   SELECT * FROM book;

We can add another relation to our database that contains data about the author of the books. Currently, we only have their author identifier numbers, which makes it easier to access several tables at one time.

The author table has an a_id, which is of type integer. It is a non-null unique primary key. It contains corresponding author numbers to the book table. The last and first name are the names of authors who wrote the books. Lastly, the email column is the e-mail address of each author. Create a table called author and insert the following values into it:

Table 4-5. Author Table Data Values

a_id

lastname

firstname

email

142

'Burgeous'

'Paulette'

'pburgeous@yahoo.com'

220

'Williams'

'Margary'

'mwilliams@goto.com'

231

'Alcott'

'Luoisa May'

'lalcott@bestbooks.com'

136

'Brown'

'Margaret W.'

'mwbrown@littletykes.com'

110

'Seuss'

''

'seuss@childprodigy.com'

After having created this table, it contains this:

Example 4-18. Author Table

 
 a_id | lastname |  firstname  |          email
------+----------+-------------+-------------------------
  136 | Brown    | Margaret W. | mwbrown@littletykes.com
  220 | Williams | Margary     | mwilliams@goto.com
  142 | Burgeous | Paulette    | pburgeous@yahoo.com
  110 | Seuss    |             | seuss@childprodigy.com
  231 | Alcott   | Luoisa May  | lalcott@bestbooks.com
(5 rows)   

Qualifying with the WHERE clause

A SQL SELECT statement consists of a target list and a qualification. The target list is a list of columns to be returned and the qualification is the part that specifies conditions which must be met in order to be returned. This is performed by the WHERE clause. It can be connected with the logical boolean operators: AND, OR, and NOT.

Example 4-19. Select using Boolean Operators

  SELECT title, author_id, cpright_date
    FROM book
   WHERE author_id =110
      OR cpright_date <= '1980-1-1';

This leads to the result:

            title            | author_id | cpright_date
-----------------------------+-----------+--------------
 The Cat in the Hat          |       110 | 1972-04-29
 Bartholomew and the Oobleck |       110 | 1997-02-01
 The Velveteen Rabbit        |       220 | 1958-02-06
(3 rows)

Joining tables with JOIN

As shown by the syntactical structure above, you can also view columns from different tables by performing a join. A join requires that both tables have an object identifier value for it to pull the two or more tables together and dynamically become one large table.

For instance, let's say we had another table with information about the publisher for each of the books. This table has the publisher name, the isbn of the books, and the suggested retail price for the books. If you want to create this table, use the CREATE TABLE and INSERT command. During creation of the isbn column, remember to specify it as not null unique primary key because we will use it to reference the book table. Also, when entering the data for publ_name and price, remember to include the data in quotes. The dollar sign will automatically be added for each instance of the suggested price.

Table 4-6. Publisher's Table Specifications

ISBN

publ_name

sugg_price

02567841

'ODG Books'

'12.99'

08741205

'Children's Books'

'19.99'

00214872

'Young Readers'

'8.00'

00602071

'ABC Books'

'4.55'

00676883

'Reading Rainbow'

'13.29'

00355184

'ABC Books'

'16.85'

If we do a select all on this table, it displays:

Example 4-20. Publisher Table

 
  isbn   |    publ_name     | sugg_price
---------+------------------+------------
 2567841 | ODG Books        |     $12.99
 8741205 | Children's Books |     $19.99
  602071 | ABC Books        |      $4.55
  676883 | Reading Rainbow  |     $13.29
  355184 | ABC Books        |     $16.85
  214872 | Young Readers    |      $6.00
(6 rows)

To join two separate tables and view the columns, use the SELECT clause:

   SELECT t.colname1, t2.colname1, t2.colname2
    FROM tableName t, tableName2 t2
    WHERE t1.colname1 = t2.colname1;

NoteJoin Syntax
 

We are accessing columns from two tables, so there must be a way to tell the parser which column is from which table. That is why the first line specifies that from table t, extract colname1; from table t2, extract colname1; and from table t2, extract colname2.

But the next line of command tells us that the actual table names are tableName and tableName2. Where did 't' and 't2' come from? Notice that it says 'FROM', then 'tableName', a space ' ', and finally a 't'. This specifies that the letter 't' represents tableName where ever it is used in the command statement. It is there so that you don't need to continuously re-type the table name. This is not part of the required syntax because this is an optional user specified field. If you decide to not use it, then you must type out the full table name, a period '.', then the column name where-ever you want to specify a column. The comma ',' separates the two table names.

The last line tells the parser how to connect the two tables together. Both tables have a unique colname1 that can be connected together. Again, to specify which column is from which table, you must list the table name, a period '.', then the column name.

Using our publish and book table, we can connect the two dynamically by:

Example 4-21. Joining Tables

  SELECT b.title, b.cpright_date, p.sugg_price
    FROM book b, publish p
   WHERE b.isbn = p.isbn;

Or you can also use this syntax:

  SELECT book.title, book.cpright_date, publish.sugg_price
    FROM book, publish
   WHERE book.isbn = publish.isbn; 

NoteNote
 

As shown by the syntax above, it is not necessary to spell out the table names if you specify in the from line

This returns:

            title            | cpright_date | sugg_price
-----------------------------+--------------+------------
 Franklin in the Dark        | 1986-09-22   |      $8.00
 Bartholomew and the Oobleck | 1997-02-01   |     $16.85
 Goodnight Moon              | 1991-04-01   |      $4.55
 The Velveteen Rabbit        | 1958-02-06   |     $13.29
 Little Women                | 1980-02-15   |     $12.99
 The Cat in the Hat          | 1972-04-29   |     $19.99
(6 rows)      

If you want to connect three or more tables, then list all of the table names in the FROM line and use equal signs to pull the separate tables together in the WHERE line. Like this:

   SELECT t.colname1, t2.colname1, t3.colname1
     FROM tableName t, tableName2 t2, tableName3 t3
    WHERE t1.colname1 = t2.colname1 AND t2.colname1=t3.colname1;

To perform a join, we must find the keys, or indexes within each table and connect them together through those keys. We can display the product of the columns when joined together, or display only a few columns. For example:

  SELECT * 
    FROM author a, book b, publish p
   WHERE a.a_id = b.author_id 
     AND b.isbn = p.isbn; 

The output of the above statement is the product of all tables specified in the FROM line.

Grouping rows with GROUP BY

When a Group By is performed, the parser logically collects rows that have the same value for a specified column and then it groups them together. The following groups all publ_names together and return the sum of the number of publishers with that name.

Example 4-22. Using the GROUP BY Clause

   SELECT publ_name, COUNT(*)
     FROM publish
 GROUP BY publ_name;

    publ_name     | count
------------------+-------
 ABC Books        |     2
 Children's Books |     1
 ODG Books        |     1
 Reading Rainbow  |     1
 Young Readers    |     1
(5 rows)
    

Hiding and sorting with DISTINCT and ORDER BY

You can revise the way a table is displayed by using DISTINCT or ORDER BY. DISTINCT returns only one instance of a duplicated value (usually it is the first instance of that value, not the largest or smallest instance). Only columns specified in the query are distinct. NON-selected columns are not compared. ORDER BY will return rows according to descending (DESC) or ascending (ASC - this is the default) order.

In the author column, we have two books written by Dr.Seuss. If we wanted to know who all of the authors are, without any author names repeated, then we can use distinct:

Example 4-23. Using the DISTINCT Clause

   SELECT DISTINCT author_id FROM book;

This statement says to look for distinct instances of the author_id column from the book table and it returns the following:

 author_id
-----------
       110
       136
       142
       220
       231
(5 rows)      

The publisher names are unordered, so we can order them according to ASC order. If you wanted to see an ordered list of the publisher names along with the ISBN numbers of books they have published, use the command:

Example 4-24. Using ORDER BY

   SELECT publ_name, isbn
     FROM publish
    ORDER BY publ_name;

The result of this query outputs:

    publ_name     |  isbn
------------------+---------
 ABC Books        |  602071
 ABC Books        |  355184
 Children's Books | 8741205
 ODG Books        | 2567841
 Reading Rainbow  |  676883
 Young Readers    |  214872
(6 rows)
   

Remember that ascending (ASC) order is the set default for order by. If you want to use descending (DESC) order, the syntax is:

   SELECT colname1, colname2, ...etc.
    FROM tablename
    ORDER BY colname DESC;  
   

Setting row range with LIMIT and OFFSET

LIMIT and OFFSET provide the capability to retrieve only a portion of the generated result from the query. If a limit count is specified, the number of rows returned correspond to that limit count. If an offset count is given, then that number is the amount of rows skipped before they begin to be returned. The syntax for limit and offset are:

  SELECT select_list FROM table_expression 
  [ORDER BY] sort_spec][LIMIT {number | ALL}]
  [OFFSET number]

NoteNote
 

When using LIMIT, an ORDER BY clause can be extremely beneficial to obtaining relevant results, because the sorting occurs before the limiting. In other words, the ORDER BY clause will help you prepare the most appropriate range of data to be limited before the LIMIT clause is processed.

This uses the book table to display only 3 rows of titles and author identification numbers ordered according to the title names:

Example 4-25. Using the LIMIT Clause

  SELECT title, author_id
    FROM book 
   ORDER BY title
   LIMIT 3;

The output of this query is:

            title            | author_id
-----------------------------+-----------
 Bartholomew and the Oobleck |       110
 Franklin in the Dark        |       142
 Goodnight Moon              |       136
(3 rows) 

The following limit the number of rows returned, and start from an offset of 2. It will return rows formatted by the date and display the author identification numbers and copyright dates.

Example 4-26. The LIMIT and OFFSET

  SELECT author_id, cpright_date
    FROM book 
ORDER BY cpright_date
   LIMIT 4
  OFFSET 2;       
  

The resulting query looks like this:

 author_id | cpright_date
-----------+--------------
       231 | 1980-02-15
       142 | 1986-09-22
       136 | 1991-04-01
       110 | 1997-02-01y
(4 rows)

Using conditional expressions

The syntax for the case statement is as follows:

   CASE WHEN condition THEN action
     [WHEN ...]
     [ELSE otheraction]
   END

The CASE statement is similar to the if/then/else statements in other languages. The condition specified returns a boolean result. If the evaluated result is true, then it performs the action. If false, then it can performs an else. When a result is not provided, it returns NULL.

Using the publish table, we can specify it to return suggested prices where if under 10, then it is low priced. If the suggested price is between $11 and $15, then it is moderately priced. Anything over that range is high priced. The statement in Example 4-27 returns the prices and publisher names:

Example 4-27. Using Case Statements

   SELECT publ_name, sugg_price,
     CASE WHEN sugg_price <='10' THEN 'low priced'
     WHEN sugg_price <='15' THEN 'moderately priced'
     ELSE 'expensive' 
     END
    FROM publish
   ORDER BY publ_name;  

The result is:

REWORK THIS EXAMPLE ENTIRELY. :)
(6 rows)     

Using sub-queries

Sub-queries are most often used to test the search condition given in a where clause. We will refer to the book and author tables for the examples below. The author table contains these data values:

 a_id | lastname |  firstname  |          email
------+----------+-------------+-------------------------
  136 | Brown    | Margaret W. | mwbrown@littletykes.com
  220 | Williams | Margary     | mwilliams@goto.com
  142 | Burgeous | Paulette    | pburgeous@yahoo.com
  110 | Seuss    |             | seuss@childprodigy.com
  231 | Alcott   | Luoisa May  | lalcott@bestbooks.com
(5 rows)

This simple example uses the equal sign to find only one instance (one row) of an author's first and last name where the author identification number is the highest.

Example 4-28. Sub-query using Equal

  SELECT lastname, firstname
    FROM author
   WHERE a_id =
        (SELECT MAX(author_id)
           FROM book);

This query returns:

 lastname | firstname
----------+------------
 Alcott   | Luoisa May
(1 row)

NoteUsing the Equivalence Operator with Aggregates
 

You may only use the equivalence (=) operator against one row. This is because a value can only be equal to another value, not to several other values. If the sub-select specified returns more than one row, an error will be returned requesting that the referenced values be operated upon by an aggregate function, or that the column in question be grouped by the GROUP BY clause.

A sub-query also gives more functionality to the idea of performing joins. This example looks for titles that begin with "The" from the book table. It also searches for the author idenfication numbers that match the specified condition. For each author_id returned, a lastname and firstname is taken from the author table and consequently displayed.

Example 4-29. Sub-queries Using IN

  SELECT lastname, firstname
    FROM author
   WHERE a_id IN
        (SELECT author_id 
           FROM book
          WHERE title
           LIKE 'The %');

The result of this sub-query is:

 lastname | firstname
----------+------------
 Williams | Margary
 Alcott   | Luoisa May
 Seuss    |
(3 rows)  

The keyword IN specified in the where condition tells the parser to search for that condition in the set of the sub-select statement. Because it is comparing to a set of values, it can be used to handle and return more than one row. On the other hand, the equal (=) to operator can only handle and return one row.

An example of a nested select statement using the LIKE operator is this:

Example 4-30. A Nested Select

  SELECT sugg_price
    FROM publish
   WHERE isbn IN
         (SELECT isbn
            FROM book
           WHERE author_id IN
                 (SELECT a_id
                    FROM author
                   WHERE lastname 
                    LIKE 'B%'));
   

This select statement searches for the author's last name, which starts with a B, and then refers to the book table using that author's identification number. It then uses that author id to look for isbn's of books that the author has written. Using the isbn, the parser then compares to the publish table and extracts the suggested price for the isbn number. In essence, it is asking what is the suggested price for books written by authors whose last name starts with a capital B.

The result is:

   sugg_price
------------
      $8.00
      $4.55
(2 rows) 

NoteDesigning Sub-selects
 

When deciphering a sub-select statement, or in the process of writing one, the best way is to read or write from the bottom, then work your way up. The reason you want to read/write statements in that particular order is because each outer select statement narrows down the possible true values. If you read from top to bottom, you may be expecting the wrong result.

Modifying Rows with UPDATE

You can also change the value of data within existing rows using the UPDATE command. Suppose the values for the suggested price column from the publish table needs to have updated prices. This adds $2.00 to every book price:

Example 4-31. Using UPDATE

  UPDATE publish
     SET sugg_price = sugg_price + '2.00';

The publish table now contains these prices:

  isbn   |    publ_name     | sugg_price
---------+------------------+------------
  214872 | Young Readers    |      $8.00
  355184 | ABC Books        |     $18.85
  602071 | DEF Books        |      $6.55
  676883 | Reading Rainbow  |     $15.29
 2567841 | ODG Books        |     $14.99
 8741205 | Children's Books |     $21.99
(6 rows)

The syntax for UPDATE is:

  UPDATE 
tablename

     SET 
expression

   WHERE 
condition; 
   

The update command can change the values for the entire column, and also allow specified values that meet certain conditions to be changed. The values that meet the condition in the WHERE statement will be changed to the SET value.

You can update more than one column values at a time by using a comma delimited list after specifying the SET clause. For instance, this modifies the suggested price and publisher name:

Example 4-32. Using UPDATE on Several Columns

 
           UPDATE publish 
              SET sugg_price = '9.25', 
                  publ_name = 'DEF Books' 
            WHERE isbn = 602071;
   

The resulting table is:

 
  isbn   |    publ_name     | sugg_price
---------+------------------+------------
  214872 | Young Readers    |      $8.00
  355184 | ABC Books        |     $18.85
  602071 | DEF Books        |      $9.25
  676883 | Reading Rainbow  |     $15.29
 2567841 | ODG Books        |     $14.99
 8741205 | Children's Books |     $21.99
(6 rows)

Removing Rows with DELETE

To remove a row, or all rows from a table, use the DELETE command. The syntax is:

  DELETE FROM 
tablename

        WHERE 
condition;

For instance, if the Book Town store does not carry books published by ABC Books and you want to delete it from the table, then use the command:

Example 4-33. Using DELETE

  DELETE FROM publish
        WHERE publ_name='ABC Books';  

The INSERT INTO command can be used to re-insert the data that you have just deleted from your publish table.

WarningCaution with DELETE
 

If a condition is not specified, then the DELETE command removes ALL rows within that table.