Learn SQL
Learn how to manage large datasets and analyze real data
Information is everywhere around us, and knowing how to use it effectively is a important skill. SQL (Structured Query Language) is a language used in data science, analytics, and engineering to handle and analyze data. By learning SQL basics, you can manage large datasets, extract meaningful insights, and make better decisions in various fields, even if you’re not in a technical role. This article will guide you through the importance of SQL, helping you understand how to work with data and perform important operations like creating tables, running queries, and using aggregate functions.

What we are going to discuss:
- Data Manipulation: Creating, updating, deleting, and altering tables involves several commands. To create a new table, you can use the
CREATE TABLEcommand, providing the table name and column details. Use theINSERT INTOcommand to insert data into a table, specifying the target table name and the data values. Modify existing tables with theALTER TABLEcommand, adding new columns or changing properties. Delete records with theDELETEcommand, specifying the table name and optional filters. Update records with theUPDATEcommand, setting new values for specific columns and applying filters as needed. - Queries: Querying data enables analysis and decision making. Combine conditions with logical operators such as
AND,OR, andLIKE. Alias columns or tables with theASkeyword, renaming items temporarily for readability. Sort results using theORDER BYclause withASC(ascending) orDESC(descending) keywords. Filter ranges of values with theBETWEENoperator, while comparing patterns with theLIKEoperator. Select unique entries with theDISTINCTkeyword, and control displayed records with theLIMITkeyword. Finally, handle missing values usingIS NULLandIS NOT NULLcomparisons within theWHEREclause. - Aggregate Functions: Analyze your dataset by summarizing data points. Count elements with the
COUNT()function, calculating minimums withMIN(), maximums withMAX(), averages withAVG(), and totals withSUM(). Group related records together using theGROUP BYclause, optionally ordering these subsets with theORDER BYclause. Narrow down groups meeting specific criteria with theHAVINGclause, allowing advanced conditional aggregation. - Working with Multiple Tables: Work efficiently with complex data structures involving multiple interconnected tables. Perform inner joins with the
JOINclause, returning results satisfying the specified join condition. Utilize cross joins via theCROSS JOINclause, generating all possible combinations across two tables. Merge results from separate queries with theUNIONclause, eliminating duplicate records automatically. Lastly, store intermediate results using theWITHclause, enabling reusable subqueries throughout your code.
1. Manipulation
Column Constraints
When you create a table in SQL, you can set certain rules for the columns, known as constraints. These rules ensure that the data in your table follows specific guidelines.
PRIMARY KEYconstraint can be used to uniquely identify the row.UNIQUEcolumns have a different value for every row.NOT NULLThis column must always have a value; it can’t be empty.DEFAULTIf you don’t provide a value for this column, it will automatically use a default value that you specify.
There can be only one PRIMARY KEY column per table and multiple UNIQUE columns.
CREATE TABLE students (
student_id INT PRIMARY KEY,
student_name VARCHAR(100) UNIQUE,
student_grade INT NOT NULL,
student_age INT DEFAULT 15
);CREATE TABLE Statement
The CREATE TABLE statement is how you make a new table in your database. You define the table’s name and its columns.
CREATE TABLE employees (
employee_id INT,
first_name VARCHAR(50),
last_name VARCHAR(50),
hire_date DATE
);INSERT Statement
The INSERT INTO statement is used to add a new record (row) to a table.
It has two forms as shown:
- Insert into columns in order.
- Insert into columns by name.
-- Insert values in order:
INSERT INTO employees
VALUES (101, 'John', 'Doe', '2024-01-15');-- Insert values by column name:
INSERT INTO employees (employee_id, last_name, first_name)
VALUES (102, 'Smith', 'Jane');ALTER TABLE Statement
The ALTER TABLE statement is used to modify the columns of an existing table. When combined with the ADD COLUMN clause, it is used to add a new column.
ALTER TABLE employees
ADD department VARCHAR(50);DELETE Statement
The DELETE statement is used to remove rows from a table. You can specify which rows to delete using the WHERE clause. If you don’t use WHERE, all rows in the table will be deleted.
DELETE FROM employees
WHERE employee_id = 101;UPDATE Statement
The UPDATE statement is used to edit records (rows) in a table. It includes a SET clause that indicates the column to edit and a WHERE clause for specifying the record(s).
UPDATE employees
SET last_name = 'Brown'
WHERE employee_id = 102;2. Queries
AND Operator
The AND operator lets you combine multiple conditions in a query. All conditions must be true for the rows to be included in the results. For example, to find cars that are red and were manufactured in 2015 or later, you would write:
SELECT model
FROM vehicles
WHERE color = 'red'
AND year >= 2015;AS Clause
You can use the AS clause to give columns or tables a temporary name (alias) in your result set. For example, if you want to display the title column as "book_title," you can do this:
SELECT title AS 'book_title'
FROM library;OR Operator
The OR operator allows you to include rows that meet at least one of several conditions. For example, to find customers from either Florida (FL) or Texas (TX), you would use:
SELECT customer_name
FROM orders
WHERE state = 'FL'
OR state = 'TX';% Wildcard
The % wildcard is used with the LIKE operator to match patterns. It can stand for any sequence of characters. For example, to find products that start with "Pro," you would use:
SELECT product_name
FROM products
WHERE product_name LIKE 'Pro%';SELECT Statement
The SELECT statement retrieves specific columns from a table. For example, this query will return the name and year columns from the "movies" table:
SELECT name, year
FROM movies;_ Wildcard
The _ wildcard is used with the LIKE operator to match any single character. For instance, to find books where the title starts with "H" followed by four characters and ending with "y," you would write:
SELECT title
FROM books
WHERE title LIKE 'H____y';ORDER BY Clause
The ORDER BY clause can be used to sort the result set by a particular column either alphabetically or numerically. It can be ordered in two ways:
DESCis a keyword used to sort the results in descending order.ASCis a keyword used to sort the results in ascending order (default).
SELECT *
FROM contacts
ORDER BY birth_date DESC;LIKE Operator
The LIKE operator can be used inside of a WHERE clause to match a specified pattern. The given query will match any movie that begins with Star in its title.
SELECT name
FROM movies
WHERE name LIKE 'Star%';DISTINCT Clause
Unique values of a column can be selected using a DISTINCT query. For a table contact_details having five rows in which the city column contains Chicago, Madison, Boston, Madison, and Denver, the given query would return:
ChicagoMadisonBostonDenver
SELECT DISTINCT city
FROM contact_details;BETWEEN Operator
The BETWEEN operator can be used to filter by a range of values. The range of values can be text, numbers, or date data. The given query will match any movie made between the years 1980 and 1990, inclusive.
SELECT *
FROM movies
WHERE year BETWEEN 1980 AND 1990;LIMIT Clause
The LIMIT clause is used to narrow, or limit, a result set to the specified number of rows. The given query will limit the result set to 5 rows.
SELECT *
FROM movies
LIMIT 5;NULL Values
Column values can be NULL, or have no value. These records can be matched (or not matched) using the IS NULL and IS NOT NULL operators in combination with the WHERE clause. The given query will match all addresses where the address has a value or is not NULL.
SELECT address
FROM records
WHERE address IS NOT NULL;WHERE Clause
The WHERE clause is used to filter records (rows) that match a certain condition. The given query will select all records where the pub_year equals 2017.
SELECT title
FROM library
WHERE pub_year = 2017;3. Aggregate Functions
Column References
The GROUP BY and ORDER BY clauses can reference the selected columns by number in which they appear in the SELECT statement. The example query will count the number of movies per rating, and will:
GROUP BYcolumn2(rating)ORDER BYcolumn1(total_movies)
SELECT COUNT(*) AS 'total_movies',
rating
FROM movies
GROUP BY 2
ORDER BY 1;SUM() Aggregate Function
The SUM() aggregate function takes the name of a column as an argument and returns the sum of all the value in that column.
SELECT SUM(salary)
FROM salary_disbursement;MAX() Aggregate Function
The MAX() aggregate function takes the name of a column as an argument and returns the largest value in a column. The given query will return the largest value from the amount column.
SELECT MAX(amount)
FROM transactions;COUNT() Aggregate Function
The COUNT() aggregate function returns the total number of rows that match the specified criteria. For instance, to find the total number of employees who have less than 5 years of experience, the given query can be used.
Note: A column name of the table can also be used instead of *. Unlike COUNT(*), this variation COUNT(column) will not count NULL values in that column.
SELECT COUNT(*)
FROM employees
WHERE experience < 5;GROUP BY Clause
The GROUP BY clause will group records in a result set by identical values in one or more columns. It is often used in combination with aggregate functions to query information of similar records. The GROUP BY clause can come after FROM or WHERE but must come before any ORDER BY or LIMIT clause.
The given query will count the number of movies per rating.
SELECT rating,
COUNT(*)
FROM movies
GROUP BY rating;MIN() Aggregate Function
The MIN() aggregate function returns the smallest value in a column. For instance, to find the smallest value of the amount column from the table named transactions, the given query can be used.
SELECT MIN(amount)
FROM transactions;AVG() Aggregate Function
The AVG() aggregate function returns the average value in a column. For instance, to find the average salary for the employees who have less than 5 years of experience, the given query can be used.
SELECT AVG(salary)
FROM employees
WHERE experience < 5;HAVING Clause
The HAVING clause is used to further filter the result set groups provided by the GROUP BY clause. HAVING is often used with aggregate functions to filter the result set groups based on an aggregate property. The given query will select only the records (rows) from only years where more than 5 movies were released per year.
The HAVING clause must always come after a GROUP BY clause but must come before any ORDER BY or LIMIT clause.
SELECT year,
COUNT(*)
FROM movies
GROUP BY year
HAVING COUNT(*) > 5;4. Multiple Tables
Outer Join
An outer join will combine rows from different tables even if the join condition is not met. In a LEFT JOIN, every row in the left table is returned in the result set, and if the join condition is not met, then NULL values are used to fill in the columns from the right table.
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name = table2.column_name;WITH Clause
The WITH clause stores the result of a query in a temporary table (temporary_movies) using an alias.
Multiple temporary tables can be defined with one instance of the WITH keyword.
WITH temporary_movies AS (
SELECT *
FROM movies
)
SELECT *
FROM temporary_movies
WHERE year BETWEEN 2000 AND 2020;UNION Clause
The UNION clause is used to combine results that appear from multiple SELECT statements and filter duplicates.
For example, given a first_names table with a column name containing rows of data “James” and “Hermione”, and a last_names table with a column name containing rows of data “James”, “Hermione” and “Cassidy”, the result of this query would contain three names: “Cassidy”, “James”, and “Hermione”.
SELECT name
FROM first_names
UNION
SELECT name
FROM last_namesCROSS JOIN Clause
The CROSS JOIN clause is used to combine each row from one table with each row from another in the result set. This JOIN is helpful for creating all possible combinations for the records (rows) in two tables.
The given query will select the shirt_color and pants_color columns from the result set, which will contain all combinations of combining the rows in the shirts and pants tables. If there are 3 different shirt colors in the shirts table and 5 different pants colors in the pants table then the result set will contain 3 x 5 = 15 rows.
SELECT shirts.shirt_color,
pants.pants_color
FROM shirts
CROSS JOIN pants;Inner Join
The JOIN clause allows for the return of results from more than one table by joining them together with other results based on common column values specified using an ON clause. INNER JOIN is the default JOIN and it will only return results matching the condition specified by ON.
SELECT *
FROM books
JOIN authors
ON books.author_id = authors.id;References
- https://www.codecademy.com/learn/learn-sql/modules/learn-sql-manipulation/cheatsheet
- https://www.youtube.com/watch?v=p3qvj9hO_Bo
Thanks for reading✨ If you like the article make sure to:






