Everything About SQL- Get Started - CodeProject
Topics to be Covered
- What is a database?
- What does Relational mean?
- A small example
- SQL Statement
- Naming Things
- Creating Things
Everything about SELECT Statement
- Limiting Result Set
Before I begin, this article is intended for each and every beginner, may it be a Software Developer or a Quality Analyst, this is a must know for every software engineer. Some might say we have Database administrator, then why should we, but I feel being a developer, one should have all round knowledge of what's going on and handle everything.
Here in this article, I will be explain each and every concept in detail to get started with SQL (Structured Query Language).
As mentioned above, SQL stands for Structured Query Language. It is a special-purpose programming language. Its purpose is to manipulate relational databases. Relational databases are one which have detailed information about all the tables in a formally organized manner. The best part here is it contains both the data-definition syntax as well as data-manipulation syntax.
What is a Database?
Now the diagram describes what it is. Database is simply a container that helps organize data in a format or constructive manner. In a database, data is stored in a rational way. Now, let's take an example that would make is help understand it better. For instance, our contact list in cell phones. If we think a bit into it, then we realize the contact list is a database with data in it in an organized manner. Then we sort by the letters search the names from the list. Thus, we perform query execution on the list to retrieve the records we want. This is the flexibility a simple database provides us. It makes it easier to:
- Query Data
- Update Data
- Insert New Data
- Delete Data
There are basically three types of databases available:
We will be discussing only on the Relational Database as that is commonly used. I have provided the links to the other types of databases for reference.
What is Relational?
The dictionary meaning to this is “Things that relate each other in a way”. Here, the meaning is keep its consistency. Relational model is the database model on which SQL is based. It is thus a way to describe data and the relationship between those data entities.
In a relational database, data is stored in a construct called
Table. Now what is a
- A table is a collection of columns that define the table properties with specific datatype associated, with specific size, thus restricting the size of each record to be inserted into the table.
- Every column can be made required or not required (using the check box available for
- Rows can be retrieved by asking questions about the data. Like “You would need all the rows of data that have a column (surname) value as having value ‘
- An essential part of a table is “
Key”. Every table must have a key associated with any column to uniquely identify a table. Usually, a table is associated with a key called Primary Key. (We will be discussing about keys in the upcoming modules).
- If more than one table uses the same primary key (one table-Primary key and other named foreign key), then the tables can be merged or joined to give different set of records set.
A Small Example
Before starting with the, I hope the readers have a view of the SSMS (SQL Server Management Studio). Here is how the SSMS looks like:
Now let's ask questions to ourselves. Why do we at all need database with multiple tables, why not a single table! let's check with an example why?
Suppose we have a
User table which would have columns named
PhNo. Now suppose we have only this table and there is a requirement in the future that a user may have more than one Phone number or more than one Email. Just think how we would approch this, as we cannot store multiple values into a column. Thus, does not this give us an indication to have another
table! Just see the difference below:
In the above figures, you can see the difference clearly, using the first table will cause real problem and a lot of redundant data/records in the table which anyway violates the Normalization rules (Normalization Rules). Thus, creating the second table which would be having a foreign key relationship with the
User table, can now store as many phone numbers as a user can have.
- A SQL statement is framed from the combination of actionable set of valid words out of which some are SSMS defined and others are user defined.
- A valid SQL statement would be if a statement/query ends with a semicolon (
;), it is not like that it's mandatory to add a semicolon at the end. But it is a part of the ANSI standard, so it is advisable to use the semicolon at the end of the statement.
- SQL is not case-sensitive. Usually, the queries/statements are written in Caps(Capitals). Sometimes the user-defined functions are in lower case. But obviously, the standard needs to be followed for better maintenance and to maintain consistency.
- In SQL statement, we can put comments. For single line comments, “
–” is used as a predecessor for the query as single line comment where as “
/**/”, is used for the query as multiple line comments.
- SQL statements start with commands. Comparing with the English language, the commands here are similar to the verbs. For example,
SELECT(Command/keyword in SQL)
In generic terms,
SELECTwould go like below:
SELECT VALUES FROM TABLE NAME;
After the command comes the part that is highly dependent on the commands that is to define the set of result/records we would expect from the query like a list of all values of the columns (generally, the wildcard symbol “
*” asterix is used to select all columns in table). Then is the
FROMclause which is used to fetch records / column values from which
table(as it explains itself).
This might be a bit controversial, as there is nothing called the best way to do. So we follow what is followed the most, or follow the conventions already being used. This would differ from person to person. But the most important factor to keep in mind is to have consistency through out. So, that when we see the database, the names of the tables define themselves. Some of the rules I am listing below:
There is a whole set of SQL commands which are required to create and modify the structures/records in the database.
CREATE DATABASE DemoDatabase; USE DemoDatabase; CREATE Table DemoDatabase.DemoTable(…);
Here, I have only shown the structures, we will be discussing more of that ahead.
Datatypes are very vital when we will be creating the tables in a database. Each column in a database has restriction on the datatype of the data that can be stored in the table.
Let's have a look at the datatypes:
||Can hold ‘N’ number of characters which are specified implicitly, to be exact statically that is the number of characters to be accepted in the value is set at the time of column creation|
||Can hold ‘N’ characters which is set dynamically, that is if
||Stores Hexadecimal data, fixed length with a length of n bytes, where n varies from 1 through 8,000. Used for images and other sort of blobs.|
||As the name suggests, it accepts value of varying length. When size is not specified, then default value is taken as 1.|
||Its length is 2 bytes and ranges from -2^15 (-32,768) to 2^15-1 (32,767)|
||Its length is 4 bytes and ranges from -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647)|
||Its length is 8 bytes and ranges from -2^63 (-9,223,372,036,854,775,808) to 2^63-1 (9,223,372,036,854,775,807)|
||Its length is 1 byte and ranges from 0-255. Can be used for values with Yes/No or the enumeration types which are limited to fewer values|
||Stores only time values, only
||Stores both date & time. This is when date is combined with time of a day with the seconds. Format is just concatenating both date format and time format|
These are some of the datatypes, I have listed out. The important thing here to keep in mind is, while creating the table, we never actually keep in mind. Thus the
datatype plays a very important role if we consider the future. Suppose we use the number of users or the userid in a table datatype as
smallint or only
int even, then sometimes in future may be the number of users increases to a large extent and overlaps the range of the datatypes. Then here the problem arises. You might have heard about the youtube crash for the Gangnam song. The reason behind it was this ... the issue with the datatype for the number of views. It increased to such an extent that it crashed.
Relational Database Management System is based on the relational model. In simple terms, it is a type of database management which stores data in related tables.
Everything About SELECT Statement
We have already discussed about the SSMS where we will be demonstrating our queries. After you read through this part of the article, you would be able to understand the details of using the
SELECT statement, asking different types of questions to the
First of all, we store the data in a database to reuse it at some later point of time. Querying data is all about asking questions while retrieving data. Understanding the combination of English and the SQL helps us postulate the queries very easily.
Let's create a database first. Look below at the query.
CREATE DATABASE QUERY
CREATE DATABASE DemoDatabase;
The above query using the master creates the database named “
DemoDatabase”. You can also directly create database using the SSMS UI. See the below images:
When in the first image as you see, the New Database is clicked, another window pops up as shown in the second image. Add the database name there and it creates the database for you with a maxsize of unlimited.
Then we create a Table for the storage of records.
CREATING A TABLE
USE DemoDatabase CREATE TABLE [dbo].[User]( [UserId] [int] IDENTITY(1,1) NOT NULL, [UserName] [nvarchar](50) NULL, [Password] [nvarchar](50) NULL, [UserEmail] [nvarchar](50) NULL, CONSTRAINT [PK_User] PRIMARY KEY CLUSTERED ( [UserId] )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, _ ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
Now the above query creates a table for you, but we need to understand the commands used in the query as some seem understandable whereas some are not.
PAD_INDEX: Specifies the index padding. The default is
OFF. More information.
STATISTICS_NORECOMPUTE: Creation & Recreation of index & auto update of the columns statistics. More info.
ALLOW_ROW_LOCKS: Enhances performances by locking the entire table rather than locking individual row. By default, SQL sets to
ON. More info.
ALLOW_PAGE_LOCKS: Set to
ONwhen there are chances of high concurrency in order to improve performance. By default, SQL sets to
ON. More info.
Thus, we have created a database which now has a table named
User. Now for your information, we can also create table using the SSMS UI as we have done for the database.
We open up the database tree structure and find Tables as a child tree under it.
We right click on that and select New
Table, we get a different UI in the query window now asking for the column names and the datatypes also to set the
NonNullable with a check box, then lets view the images:
Add the Primary key column and set it as Primary Key like below:
Then we need to set the column as an identity column which will be auto-incremented by 1 everytime a new row is added. The developer need not bother about incrementing or setting the primary key. Set it as below:
Then, we add all the columns with the datatypes mentioned above and save the table with a proper table name. Now, here the checkbox for the Nullable column plays a vital role. Think and specify the value here. Finally, the UI looks like below:
Thus, now we have created a table also. Now it is time to add records/rows into it. We will be writing queries to insert the values into the table. Let's see how:
INSERT ROWS INTO TABLE
INSERT INTO dbo.[User] (UserName,Password,UserEmail) VALUES ('Suraj','abc123','firstname.lastname@example.org');
Here, as you can see, we have added only the three column values as the primary key
userId is auto-incremented, i.e., an Identity column. Thus, this adds the values. For a faster approach, usually developers use the SSMS UI here also to add the records.
We right click on the specific table into which we will be adding records, then click on Edit Top 200 (default value) items.
Then, we see in a new query window the table columns with the rows of records already added and a default empty row where we will be adding the values like below:
Here, we add the values in the row containing
NULL values. The Primary key column is non editable as it will increment itself. The other columns are populated with the values to set the table records.
Now let's get back to our topic of discussion
1. Select All Records From the Table
SELECT * FROM dbo.[User]
This query is very simple and easy to understand. Here, the
FROM clause is used which is easy when we retrieve data from a single table. As mentioned above ‘
*’ relates to all columns in a table. Thus, the above query returns all the records of all the columns.
Another point here is if we select some columns out of all columns, then we should/must qualify the column names in the
Select list of the query. We can bring in the concept of ‘aliasing’ in our next query which selects some columns in the table.
2. Select Fewer Columns from the Table
SELECT users.UserName as Name, users.Password as PWD FROM dbo.[User] as users
The above query returns only the User name column as
Name and the Password column as
PWD in the result set. This is the power of alias, it changes the column head names. Below is the sample result set:
Similarly, you can use shorter Table alias names and use it throughout your query for convenience and select as many and the specific columns you need from the table.
There are various ways for limiting the result sets we get. By default, all the rows that match the
From clause query are retrieved in the result set. There are two different ways to limit the result set of the
One is to add conditions to the Query after
Another is to add the
DISTINCT qualifier to the
Distinct is a result set qualifier. Also
TOP (number) can be used to limit the result set.
We will see the second type here and move to the condition type consequently.
3. Select Distinct Rows
SELECT DISTINCT users.UserName as Name FROM dbo.[User] as users;
This query returns all the rows that are distinct based on the names values.
4. Select Some Rows from TOP
SELECT TOP 500 users.UserName as Name FROM dbo.[User] as users;
This would return only the top 500 rows no matter how many in total we get as result set.
Next, we discuss about the conditions type:
This is used after the
FROM clause and can be thought of a search from the records using the conditions based on which we would be doing the result search. Searching with lot more precision. The body of the
where clause will be one or more expressions with return value as
False. If the response is
true based on the conditions for a specific row coming under the check, then the row gets added to the result set which is retrieved finally from the query.
A simple example here would be
Get the list of records/users whose name is ‘
5. Select Users with UserName as ‘Suraj'
SELECT * FROM dbo.[User] as users WHERE users.UserName = 'Suraj' ;
This selects all the users with the
Suraj, usually this is made as
Suraj. Thus the result set would look like below:
To master the
WHERE clause in SQL, we need to master how all the operators available behave. Let's have a look at the operators available with the
WHERE clause in SQL and how they would behave:
|Not Equals to ‘<>’||Returns
|Greater than ‘>’||Returns
|Less than ‘<‘||Returns
|Greater than or equal to ‘>=’||Returns
|Less than or equal to ‘<=’||Returns
Now let's have a look at the queries and how they behave when operators are applied. Here, I will be applying operators on the
6. Select the users with UserName greater than ‘Suraj'
SELECT * FROM dbo.[User] as users WHERE users.UserName > 'Suraj' ;
The result set here would return the name starting with alphabets more than ‘S’, but remember the check is for each letter until the greater than matches. Result set:
Thus the name starting with ‘
T’is displayed in the result.
Similarly, less than and the other operators would work out.
7. Select the users with UserName less than ‘Suraj'
SELECT * FROM dbo.[User] as users WHERE users.UserName < 'Suraj' ;
8. Select the users with UserName Not Equal to ‘Suraj’
SELECT * FROM dbo.[User] as users WHERE users.UserName <> 'Suraj' ;
Then let's peep into the
AND OR operators. As we all know, each boolean expression can be combined with other boolean expressions. These are combined by using the
AND OR operator.
AND means that both the boolean expressions combined by using
AND, both must evaluate to
True for the row to satisfy to be a part of the result set. Like for example, list out the users who have first name as
Suraj and email as a specific value. Here, as per the simple English language,
and is to be used here to evaluate and produce the result set.
9. Select the users with UserName as ‘Suraj’ & UserEmail as ‘email@example.com'
SELECT * FROM dbo.[User] as users WHERE users.UserName ='Suraj' AND users.UserEmail = 'firstname.lastname@example.org';
The result set goes as below:
Thus here in the result set, we can check that both the expressions on the sides of the
AND operator turns out to be
true for both the rows in the result set.
OR keyword means that any of the expressions on the either side of
OR operator can be
true, then the row becomes a part of the result set. That is when any of the expression satisfies then the row is a part. Just keep in mind if the first expression turns out to be
true, then the second expression is not evaluated. If first expression turns to be
false, then the second expression is evaluated to be checked.
10. Select the users with UserName as ‘Suraj’ or UserEmail as ‘email@example.com'
SELECT * FROM dbo.[User] as users WHERE users.UserName ='Suraj' OR users.UserEmail = 'firstname.lastname@example.org';
The result set is as below:
As we see first when the first expression, i.e.,
Suraj is a part of the result set and also the rows with
UserEmail as ‘
BETWEEN operator acts on any column and taking two values checking the range. If a row value lies in the range specified in the
BETWEEN condition, then the expression evaluates to
11. Select the users with Age Lying Between Specific Value
SELECT * FROM dbo.[User] as users WHERE users.Age BETWEEN 25 AND 50 ;
The result set is as below:
Thus, the result set includes only those rows where the
Age lies between
50, others are neglected. Between in inclusive of the values being checked.
LIKE operator is a special one which is used with
strings. When we give
LIKE operator to the
strings, then it looks and searches for those values of the
string which match the pattern specified in the
LIKE condition. For
LIKE ‘%’ is used which is otherwise called wild card symbol. The
% can be used anywhere in the
string. Like for example, who all are the users whose names start with ‘
12. Select the users with the UserName starting with letter ‘S'
SELECT * FROM dbo.[User] as users WHERE users.UserName LIKE 'S%' ;
This gives the result with all the
usernames starting with
S. Thus we write
Like ‘S%’ wild card symbol specified after the letter means starting with. Specifying wild card symbol at the beginning like ‘
%S’ means all usernames ending with
SELECT * FROM dbo.[User] as users WHERE users.UserName LIKE '%C' ;
Thus this is the result set with the records where the column value
UserName contains ‘
IN operator is another special operator which requires column and a list of specified/valued values. Values can be of any datatype. If a row column value matches any of the specified values in the list used for the
IN operator, then the row is added to the result set.
SELECT * FROM dbo.[User] as users WHERE users.UserName IN ('Suraj' , 'test') ;
This gives the result set having all
Users with user name
Suraj and test. Thus, the
IN operator seems to overlap with the
BETWEEN operator, but they are different in some scenarios.
IS operator is less used and a special keyword, which is used to check for
NULL values. An interesting thing to keep in mind is that the
NULL values are not assignable to the ‘
=’ operator & its complement is
Thus, here I have tried to specify the basics in SQL. As I mentioned, I have tried here, so any queries, suggestions & concerns are most welcome. Let us please discuss and clear out doubts and keep sharing.
- Introduction to SQL by John Flanders
- SQL authority by Pinal Dave
- W3 schools