SQL (structured query language) is the de facto standard language for interrogating and managing relational databases. Originally developed by IBM in 1974, it was standardized by ANSI in 1986 and subsequently by ISO. The latest comprehensive version is SQL 2003, with some parts being modified in 2006. SQL is employed in a large number of database management systems, including SQL Server, mySQL, postgreSQL, and Oracle.
Basic operations
SQL is built upon four types of operation (called queries in SQL terminology):
Selecting data
The select operation is the most important one used in SQL, used to obtain different sorts of information. Consider a database that has a table called ‘data’ containing information about employment. The rows are the records in the database and the columns are the fields. In this example the fields are ‘name,’ ‘occupation,’ ‘salary,’ and ‘city.’ A simple select query is:
This query retrieves all fields (‘*’) from all rows in the table called ‘data’: that is, it retrieves the whole table. The fields to be retrieved can be limited:
This query retrieves just the ‘name’ and the ‘salary’ columns for all records. It is also possible to use conditional queries limiting the information obtained for both records and fields, for example:
This query retrieves only those rows where the field called ‘occupation’ contains the value ‘dentist’; and, from those rows, it retrieves only the fields ‘name’ and ‘salary.’
Summarizing data
Data from multiple rows can be summarized (aggregated, in SQL terminology) and calculations can be performed on numerical data:
This query retrieves all records and divides them into groups according to the value of the field ‘occupation.’ It calculates the mean value of the ‘salary’ field for all rows in each group and returns one row for each group containing two fields: the grouping field, ‘occupation’; and the calculated mean salary of the group. In other words, it produces a list of occupations and the average salary for each occupation. A further restriction to workers in a particular city can be applied using a ‘where’ clause:
Multiple tables
The real power of SQL lies in the ‘from’ clause, which can retrieve data from several tables:
This query assumes that the tables ‘data’ and ‘cities’ both have a column called ‘city’ that can act as a cross-reference (a foreign key, in SQL terminology). It first combines—joins, in SQL terminology—these two tables to construct a pseudotable. In this, one row is generated for every possible combination of rows from the two tables where the values of the ‘city’ fields are equal; each such row contains all fields from both tables. This pseudotable is then used to perform the aggregation in a similar fashion to the example given above.
Changing data
Insert queries
Insert queries work in either of two ways. Rows can be selected from one table or joined set of tables and inserted into another:
This query adds the results of the example given above to the table ‘summaryIllinois.’ Alternatively, insert queries can add one row of arbitrary values to a table:
This query adds one row to the table ‘data,’ with the three fields named receiving the given values.
Update queries
Update queries use values generated by SQL expressions to alter the values of specified fields in specified rows in a specified table:
This query updates the ‘data’ table to reflect a 10% salary increase for all dentists in Illinois.
Delete queries
Delete queries work in a similar way to update queries:
This query deletes all rows from the ‘data’ table relating to doctors in New York.
Note
This outline of SQL has only scratched the surface of the language. The SQL standard is many hundreds of pages long and there are also several standard extensions to increase functionality (e.g. the use of Java within SQL). Moreover individual commercial database developers usually have their own versions of SQL.