Databases

a very quick primer

2017-02-28

by

Kurt Neufeld

https://goo.gl/6MgbSx

Caveats

I'm not a DBA or otherwise DB expert.

I have used them a few times though.

Databases

Databases are great.

Pretty much the first killer app of computers.

What is a Database?

A database is an organized collection of data.
— Wikipedia

Thanks Wikipedia!

What is a Database?

Think of a spreadsheet:

  • the Excel file is the database
  • a page is a table
  • columns have types
  • each row is a record
  • a row is uniquely identified by a primary key

Excel?

Where even Access slays Excel is why databases exist.

  • the types of a column are enforced
  • can't put "abc" in an integer column
  • a column can point at another table
  • this is called a ForeignKey
  • this is where the relation in relational database comes from
  • columns can be indexed for fast lookups
  • a few thousand other reasons...

SQL

Structured Query Language

SQL is the programming language that is used to interact/control a database.

Reads sorta like English, a declarative language.

In my opinion sqlite.org has the best docs.

Warning: each DBMS has slightly different sql syntax.

Because computers.

SQL - Example


-- get all the columns from the users table
select * from users;
        

begin; -- transaction
-- debit an account
update accounts set balance = balance + 100
where id = 111;
-- credit an account by the same amount
update accounts set balance = balance - 100
where id = 222;
commit; -- only if both succeed apply either
```
        

ACID

ACID is a set of properties that make a database a database and not random software with a save icon.

A(tomic)
requires that each transaction is all or nothing
C(onsistent)
each transaction moves db from consistent state to consistent state
I(solation)
concurrent transactions never interact
D(urability)
a committed transaction will remain so even if DB crashes/power loss/etc

Different databases are a little more acidic than others.

3NF

Without going all CompSci on you, there's this thing called Third Normal Form or 3NF for short.

Or just normalized

Basically it means, do not duplicate data.

## Users | id | name | company | company addr | |----|-------|---------|--------------| | 1 | Alice | Acme | 123 1st St | | 2 | Bob | Acme | 123 1st St | | 3 | Eve | NSA | 7th Circle |
## Users | id | name | company_id | |----|-------|------------| | 1 | Alice | 1 | | 2 | Bob | 1 | | 3 | Eve | 666 |
## Company | id | name | address | |----|------|---------| | 1 | Acme | 123 1st St | | 666 | NSA | 7th Circle |

MS Access

You're in the wrong room

No matter what room you're in

MS SQL Server

I hear it doesn't suck

But I wouldn't risk it

Oracle

Cons

  • probably the most evil IT company on the planet
  • expensive
    • $20k/core/machine
    • prod, dev, qa, shards, backup, etc
    • VM host not guest
  • do not recommend

SQLite

A single file, easy to backup and distribute.

SQLite is a library not a process.

Use Case

  • use instead of making your own file format
  • prototyping some django stuff
  • embedded systems

MySQL

MySQL comes with two engines. Never use MyISAM, use InnoDB.

InnoDB is probably the default when creating a database, but be aware that this exists.

Or not, since you'll never use this DB anyways.

https://grimoire.ca/mysql/choose-something-else

MySQL - Pros

  • open source (since '95)
  • virtually everywhere
  • lots of 3rd party support
  • simpler access/authentication model
  • https://www.phpmyadmin.net/

MySQL - Cons

  • not really ACID
  • bad defaults
  • will accept/truncate invalid data but show warning (somewhere)
  • doesn't have much in the way of transactions
  • bought by Oracle, fork is MariaDB
  • it's the javascript/php of databases
  • do not recommend

PostgreSQL

  • http://www.slideshare.net/anandology/ten-reasons-to-prefer-postgresql-to-mysql
  • http://www.slideshare.net/PGExperts/development-of-83-in-india

PostgreSQL - Pros

  • pretty sweet logo
  • open source (since '95, successor code from '86)
  • significantly more sophisticated than MySQL
    • can have multiple schemas
    • has views
    • has transactions (each statement is a transaction)
    • more/better data types (JSON, arrays, gis, etc)
    • NOTIFY/TRIGGERS/stored procedures/etc
    • utf-8 is actually utf-8
  • fully ACID
  • the team behind it really seem to know what they're doing
  • AAA+++ Aardvark! 10/10 would install again!!1!

PostgreSQL - Cons

  • significantly more sophisticated, aka: more complicated
  • not as widely available
  • has mistaken impression of being slow
  • would still recommend

The End

questions?