Types and Affinities in SQLite

Types and Affinities in SQLite

SQLite’s data typing model is extremely flexible. Most SQL database engines are statically typed, where data types are defined and enforced by the column a value belongs to. SQLite, in contrast, uses dynamic “manifest” typing, where each value is associated with its own data type, regardless of the column’s explicitly declared type (also known as an “affinity” - more on that below).

How are Data Types encoded?

SQLite uses a serialization mechanism known as “record format”, where each record contains a header which defines the types and size of the stored data, followed by the actual data payload. Each value stored in SQLite is prefixed with a serial type code, a variable-length integer that tells SQLite how to interpret the data in the record. Different data types are encoded and stored in different ways to optimize space - for example, INTEGER values are stored between 1 to 8 bytes. This variable length allows SQLite to minimize the storage size of smaller integers.

Storage Classes in SQLite

Values in SQLite can be associated with one of 5 different storage classes. Storage classes are a generalized data type (for example, there are 7 integer data types of varying size, but one INTEGER storage class). While data typing changes how data is stored on disk, when that data is read into memory for processing, it is converted and handled as one of the 5 storage classes.

The five storage classes include some familiar faces, like the NULL datatype, which behaves as you might expect from a null value. For numeric values, there is the INTEGER type for signed integers, and REAL for floating point values (stored as 8-byte IEEE floats).

For character data, there is the TEXT data type, and for everything else, there is BLOB, which stores data exactly as it was input. Data types common in other SQL database engines like Date and Time are not present, though SQLite does come with functions to store dates and times as TEXT, REAL, or INTEGERs.

There is also no Boolean data type - true and false are represented by 1 and 0 (note that “TRUE” and “FALSE” are now aliases for these integer values).

While you can create a table with column types like VARCHAR, TINYINT, or FLOAT, these typings will not be enforced. Instead, types are generalized through the concept of “type affinity”.

Note: Unlike VARCHAR(n) in PostgreSQL, where n is the maximum number of characters the engine will allow, the maximum size of a string (or blob) in SQLite is determined by the SQLITE_MAX_LENGTH configuration option (which defaults to 1 billion bytes). You can still use VARCHAR(n) in your SQLite statements, but SQLite won’t actually enforce the limit for you - this is a common theme in how SQLite handles data. If it fits on disk, it will get stored, regardless of what kind of data is inserted into which column.

What is Type Affinity?

A column’s type affinity (also known as its column affinity) is essentially a recommendation for the data type a column should store. Any column can still store any type of data, but a column’s type affinity affects how SQLite handles inserting data into columns when there is a mismatch between the data’s type and the columns explicitly (or implicitly) declared affinity.

Each column in a SQLite table is assigned one of the following column affinities:

  • INTEGER (for columns with a declared type containing “INT”, like TINYINT or BIGINT)

  • TEXT (for columns with a declared type containing "CHAR", "CLOB", or "TEXT")

  • REAL (for columns with a declared type containing “REAL”, “FLOA” or “DOUB”

  • BLOB (for columns with a declared type of BLOB or no declared type).

  • NUMERIC (for columns with all other declared types, like NUMERIC, DECIMAL(10,5), BOOLEAN, DATE, and DATETIME)

Columns with a NUMERIC affinity can contain values using all five storage classes. When text is inserted into a NUMERIC column, the storage class of the value is converted to INTEGER or REAL (if its a well-formed integer or real literal, respectively.) NULL and BLOB values are not converted.

Note that this means that ‘3.0e+5’ inserted into a column with a NUMERIC affinity will be stored as 300,000, which may take some getting used to.

Columns with an INTEGER affinity behave the same as columns with a NUMERIC affinity. The only difference between the affinities is when a CAST expression is used. For example, CAST(7.0 AS INT) returns 7, while CAST(7.0 AS NUMERIC) returns the floating-point 7.0.

Columns with a TEXT affinity store data as TEXT, BLOB or NULL. They convert integers and real numbers into TEXT before storing. BLOB columns have no preference of storage class, so no conversion takes place.

This may seem like a lot to remember, especially if you’re coming from a more straightforward static typing system. SQLite goes out of its way to provide this flexibility for a couple reasons. It allows SQLite to be used in cases when schemas and data types might not be known at compile time, or will vary across deployments. This is especially useful for diverse or unpredictable ingestion. SQLite comes with the adaptability of a document database like MongoDB, but without sacrificing SQL.

Allowing columns to store any type of data also simplifies database schema design and modification, though this can be a double-edged sword. It’s great for quick prototyping, but for production use cases, make sure to follow the principles of good schema design to optimize the queryability of your data, and to minimize the chances of painful migrations and breaking changes down the line.

Using STRICT

In November of 2021, SQLite introduced the STRICT keyword, which enables type checking on columns. In a STRICT table, columns can only be assigned the INT, INTEGER, REAL, TEXT, BLOB, and ANY data types. NULLs can be inserted in any column (unless it is declared with the NOT NULL constraint). When a value whose type does not match the explicit column type, a runtime error is thrown. Combining a STRICT table schema with the ANY type is a great way to get all the benefits of a flexible data model while minimizing the surface area of unexpected behavior.

Transitioning to SQLite

If you’re coming from a traditional client/server database like PostgreSQL or MySQL, you may find SQLite’s typing system a bit quirky at first. Once you get the hang of it, you might start to appreciate the ability to ingest data in all shapes and sizes, regardless of schema. Or, you can always stick to STRICT!

If you’re already building with SQLite, or just want a performant real-time database with local-first capabilities, sign up for SQLite Cloud.

Until next time!