INTRODUCTION
In today's PostgreSQL lesson, we will go through examples of the basic commands that allow us to request and view the data of a table in the database, as well as how to retrieve data from individual columns. The main command we will use to send queries to the database is SELECT.
WHAT DOES THE
NORTHWIND DATABASE DESCRIBE?
Before we start with the commands, let's briefly describe the structure of the Northwind database that we installed in the previous section.
The Northwind database contains all sales transactions that have taken place between the company and its customers, as well as purchase transactions between the company and its suppliers. Like any Relational Database Management System (RDBMS), Northwind allows data to be organized into logical units called tables. Additionally, relationships exist between the tables, enabling us to retrieve data from multiple tables simultaneously more easily or to enforce constraints when inserting data into the database. The relationships between the tables in the Northwind database are illustrated in the following diagram.
For now, we accept the fact that there is a database (Northwind) that contains tables with data. This will be our starting point—learning the commands that allow us to view table data filtered by specific conditions.
To retrieve data from the database, we use Structured Query Language (SQL). Since SQL includes many commands for different purposes, these commands are categorized into five groups.
Data Definition Language (DDL): This category includes commands that allow us to modify or change the structure of the database. For example, the CREATE TABLE command is used to create a new table in the database.
Data Query Language (DQL): This category includes commands that allow us to retrieve data from the database. The SELECT command belongs to this category.
Data Manipulation Language (DML): This category includes commands such as INSERT, UPDATE, and DELETE, which allow us to modify existing data in tables, add new data, or even delete data from the database.
Data Control Language (DCL): This category includes commands used to grant or revoke user access permissions to the database. Commands such as GRANT and REVOKE belong to this category.
Transaction Control Commands: This category includes commands that manage multiple data operations as a single group. Commands like COMMIT and ROLLBACK belong to this category.
SIMPLE
SELECT STATEMENT
Returning back to our scenario and specifically to the Northwind database, we would like to start by viewing all the data in the customer table.
How can we do this using SQL?
Using the pgAdmin Web interface of PostgreSQL, we can easily retrieve data with the help of the Query Tool.
To open it, click on the database icon with the arrow at the top left of the screen.
Now, inside the Query Tool, which has opened on the right side of your screen, you can write your SQL queries.
In PostgreSQL, the SELECT statement is used to send a query to the database and retrieve data from a single table or multiple tables.
Let's write a simple SELECT query inside the Query Tool, execute it, and then explain what we have achieved with it.
In SQL, after some practice with the commands, you'll be able to translate your words—which essentially represent your requirements—into SQL commands. In this example, the requirement given to us is as follows: "We want to see all the data contained in the customers table."
Now, we need to translate these words into SQL commands.
Since we are requesting data from the database without modifying it, the command we need is SELECT, which belongs to the DQL (Data Query Language) category.
So, to retrieve data from the database, we always start our command with SELECT.
Next, we decide how many columns we want to view:
- To select all columns, we use the asterisk (
*) symbol. - Later, we’ll see how to select specific columns individually.
After selecting the columns, the next step is to specify which table we want to retrieve the data from.
In this example, we are requesting data from the customers table. After selecting the columns, the next step is to specify which table we want to retrieve the data from. In this example, we are requesting data from the customers table. Please note that he result displayed in the screenshot is often a sample of the data, not the entire dataset. You can scroll down to see the entire set of the result data.
Every complete SQL query ends with a semicolon (;).
It’s easy to forget, but it’s very important for the proper execution of the query. The semicolon indicates the end of the statement, allowing the database to process and execute the command correctly.
Although SELECT * is very useful, you may want to view only specific columns from a table instead of all columns.
For example, the scenario might be something like:
"We want to find the company name, the city, and the country of the suppliers."
In this case, you would specify exactly which columns you want to retrieve rather than using * for all columns.
The table for the suppliers is called suppliers. But how do we know the names of the columns inside the table to select them?
To find this information, you can click on the "Columns" option under the suppliers table. This will display all the column names in the suppliers table, allowing you to know exactly what to select.
Since we are requesting specific columns from the suppliers table, instead of using the asterisk (*), we simply write the column names. The order in which you write the column names will determine the order in which the columns are displayed on your screen.
To see only the countries where the company has customers, a logical query would be to select the country column from the customers table. You might think of writing a simple query like this:
DISTINCT
Although we are getting the information we want, we would need to search to find how many different countries there are in the country column. In the result we received, there are country names that appear multiple times. In each country, we may have more than one customer, and this is logical. However, we want to see the total number of countries without the duplicate or triplicate entries. How can we achieve this?
To get unique values as a result, we need to add the word DISTINCT before the column name. The word DISTINCT automatically filters the data and shows each different value of the column only once. In other words, the result will be the unique values in the column.
The word DISTINCT applies only to a single column. This means that if we wanted to see both the cities and countries where the company has customers, but at the same time, we wanted to see unique values for the cities, the query would be as follows:
COUNT( )
If we now want to know not only the countries but also the total number of distinct countries where the company has customers, we need to add another word called COUNT.
Essentially, COUNT() is a function that accepts a column as a parameter inside the parentheses. It counts the number of non-null values in that column.
Let's first see how it applies to the city column:
The number that appears includes all the countries stored in the table, along with their duplicate and triplicate values. If we want to see how many unique countries the company has customers in, we can combine COUNT with DISTINCT.
By using COUNT(DISTINCT column_name), we can count the number of unique values in a specific column, removing any duplicates.
If we now want to find the total number of rows (records) in a table, we can use COUNT(*). This is commonly used when we want to find the actual number of rows of data inside a table.
Of course, with SELECT we can do much more, such as performing arithmetic operations.
Remember, SELECT does not affect the data in our tables—it only retrieves and displays the results. This means we can try various hypothetical scenarios and see their results on our screen without modifying the actual data.
For example, if we want to perform a simple arithmetic operation like addition, subtraction, multiplication, or division on a column, we can easily do that.
To find out how many days have passed from when the company received an order until it was shipped, we can calculate the difference between the order date and the shipped date from the Orders table in the Northwind database.
Since the Orders table contains the columns order_date and shipped_date, but does not directly have the difference between these two dates, we can create a new column by subtracting order_date from shipped_date. Additionally, to know which customer made each order, we will include the customer_id column in our query.
Since the second column is the result of subtracting two columns, the default name for it might seem a bit unclear. We can give it an alias using the AS keyword, followed by the desired name for the new column. This will make the output more readable and meaningful.
To avoid mistakes when typing column names, especially if they are long or complicated, you can simply drag and drop the column names directly from the Columns section in the Query Editor. This will automatically insert the correct column names into your query, ensuring accuracy and saving you time.
Supporting educational resources, like a blog offering free online courses, through donations is a great way to help them grow and expand their reach. Contributions can provide the platform with the ability to offer more content, tools, and better services to users.
full-width


0 Comments
What do you think about Ground of Code?