Declaring models
One struct, one
A model is a struct with #[derive(Model)]. The derive generates the trait impl, the per-field column constants, and an objects() Manager that picks up the ambient database pool.
For the full list of supported field types, what each maps to on Postgres vs SQLite, and how to handle constraints the derive doesn't accept yet (varchar(64), default, unique, etc.), see Column types.
The minimal model
use umbral::prelude::*;use chrono::{DateTime, Utc}; #[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Post { pub id: i64, pub title: String, pub body: String, pub published_at: Option<DateTime<Utc>>,}The rules the derive enforces:
- One primary-key field. By default that is a field named
id; alternatively, mark any field with#[umbral(primary_key)](e.g. a string codename). Supported PK types: every Rust integer width (i8/i16/i32/i64andu8/u16/u32/u64, withi64the auto-increment default),uuid::Uuid(no auto-generation, you supply the value), andString(no auto-generation, you supply the value). A custom newtype works too with one line:impl umbral::orm::PrimaryKey for MyId {}(the type must lower to asea_query::Value). - Field types come from the M3 catalogue: signed and small-unsigned ints (
i16/i32/i64),f32/f64,bool,String,chrono::NaiveDate/NaiveTime/DateTime<Utc>,uuid::Uuid,ForeignKey<T>. serde_json::Valuefor a JSON column;Option<serde_json::Value>for a nullable one.Option<T>of any of the above is the only path to a nullable column.
PK auto-generation rule. Manager::create(instance) skips the id column in the INSERT when the value is the "default" sentinel, letting the database assign it:
| PK type | Sentinel value | Result |
|---|---|---|
i64 / i32 | 0 | Database assigns the next autoincrement value. |
uuid::Uuid | Uuid::nil() (all zeros) | Column is omitted from INSERT. Supply a real UUID or add a DB-side default. |
String | "" (empty string) | Column is omitted from INSERT. Supply the slug at create time. |
Pass a non-sentinel value to force a specific PK (useful for seeding or test fixtures).
JSON columns
A field typed serde_json::Value maps to JSONB on Postgres (binary storage, indexable, with Postgres's full operator surface) and to TEXT on SQLite (sqlx's json feature handles serde on the boundary). The same struct compiles and runs against either backend.
use umbral::prelude::*; #[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Event { pub id: i64, pub kind: String, pub payload: serde_json::Value, // JSONB / TEXT, required pub meta: Option<serde_json::Value>, // nullable JSON}JSON operators
Two Postgres-native predicates let you query inside a JSON column directly from a filter chain.
path_text(keys: &[&str]): extracts a nested value as text using chained->steps plus a final->>. Returns a builder; finalize with.eq(s),.ne(s),.is_null(), or.is_not_null().has_key(key: &str): true when the JSON object contains the given top-level key (Postgres?operator). Returns aPredicatedirectly.
use umbral::prelude::*;use serde_json::Value; #[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Event { pub id: i64, pub payload: Value,} // Deep path equality, renders as "payload" -> 'author' ->> 'name' = 'alice'let alice_rows = Event::objects() .filter(event::PAYLOAD.path_text(&["author", "name"]).eq("alice")) .fetch_pg(&pg_pool) .await?; // Top-level key presence, renders as "payload" ? 'status'let has_status = Event::objects() .filter(event::PAYLOAD.has_key("status")) .fetch_pg(&pg_pool) .await?;SQLite JSON1 operator support is deferred; is_null() / is_not_null() and asc() / desc() ordering remain cross-backend today.
Array columns
A field typed Vec<T> maps to a native Postgres array column (T[]). Option<Vec<T>> produces a nullable array column. Array fields are Postgres-only: the boot-time system check fails with a clear error naming the model and field if an Array field is registered against a SQLite backend.
Supported element types: i16, i32, i64, f32, f64, bool, String, uuid::Uuid.
use umbral::prelude::*; #[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Event { pub id: i64, pub kind: String, pub tags: Vec<String>, // TEXT[], required pub scores: Option<Vec<i64>>, // BIGINT[], nullable} // Fetch rows, use the _pg terminal for models with PG-only fieldslet events: Vec<Event> = Event::objects().fetch_pg(&pg_pool).await?;Models that include at least one PG-only field cannot use the cross-backend .fetch() / .first() / .count() / .exists() terminals. Use the _pg variants instead: .fetch_pg(&pool), .first_pg(&pool), .count_pg(&pool), .exists_pg(&pool). The pool is passed at the terminal, not on the manager. See docs/specs/04-orm-model-and-fields.md for the full _pg surface.
Array operators
Filter on array contents with four Postgres-native predicates:
.contains(elem): column array contains the given element (@>)..contains_all(iter): column contains every value in the iterator (@>)..contained_by(iter): column is a subset of the iterator's values (<@)..overlaps(iter): column shares at least one value with the iterator (&&).
let info_events = Event::objects() .filter(event::TAGS.contains("info")) .fetch_pg(&pg_pool) .await?; let any_match = Event::objects() .filter(event::SCORES.overlaps([10i64, 20])) .fetch_pg(&pg_pool) .await?;Empty-input semantics are well-defined: contains_all([]) is always true; contained_by([]) and overlaps([]) are always false.
Network address columns
Postgres natively stores and indexes IP addresses and MAC addresses. Three Postgres-only field types expose this: INET for host addresses with an optional netmask, CIDR for network addresses (host bits must be zero), and MACADDR for hardware addresses.
| Rust type | SqlType | Postgres column |
|---|---|---|
ipnetwork::IpNetwork | Inet | INET: IPv4 or IPv6 host address, optional prefix length |
ipnetwork::IpNetwork | Cidr | CIDR: network address, host bits enforced zero by Postgres |
mac_address::MacAddress | MacAddr | MACADDR: 6-byte hardware address |
The derive defaults ipnetwork::IpNetwork to Inet. Opt into the stricter CIDR semantics with the field-level #[umbral(cidr)] attribute (the Rust type stays ipnetwork::IpNetwork; only the column type changes):
#[umbral(cidr)]pub network: IpNetwork, // CIDR instead of INETuse umbral::prelude::*;use ipnetwork::IpNetwork;use mac_address::MacAddress; #[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Node { pub id: i64, pub addr: IpNetwork, // INET, required pub mac: MacAddress, // MACADDR, required pub fallback: Option<IpNetwork>, // INET, nullable} // Filter by exact address, use _pg terminals for PG-only modelslet target = "10.0.0.1/24".parse::<IpNetwork>()?;let nodes: Vec<Node> = Node::objects() .filter(node::ADDR.eq(target)) .fetch_pg(&pg_pool) .await?;The predicate surface ships .eq(val) / .ne(val) / .asc() / .desc(), plus .is_null() / .is_not_null() on nullable variants. Network containment operators (<<, >>) are deferred to a later phase.
Full-text search columns
Postgres's tsvector type stores a pre-processed lexeme document that supports fast @@ full-text matching. umbral exposes this as the umbral::orm::TsVector newtype: a String wrapper with sqlx Type/Encode/Decode impls that map to a tsvector column. Like array and network fields, TsVector is Postgres-only; the boot-time system check fails with a clear error if the field is registered against a SQLite backend.
use umbral::prelude::*;use umbral::orm::TsVector; #[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Doc { pub id: i64, pub title: String, pub search: TsVector, // tsvector, required pub extra: Option<TsVector>, // tsvector, nullable} // User-friendly query syntax: spaces=AND, OR, -word, "quoted phrase"let results = Doc::objects() .filter(doc::SEARCH.matches_websearch("rust framework")) .fetch_pg(&pg_pool) .await?;Two matching predicates are available on every TsVector column:
.matches_websearch(query):col @@ websearch_to_tsquery($1). Accepts natural language: spaces mean AND, bareORmeans OR,-wordnegates,"quoted phrase"requires adjacency. Preferred for search boxes..matches(query):col @@ to_tsquery($1). Requires stricttsquerysyntax (&,|,!,:*for prefix). Use it when you control the query string programmatically.
Both predicates compile only for Postgres; use the _pg terminal variants (.fetch_pg, .first_pg, etc.) on any model that includes a TsVector field.
Indexes are automatic. A tsvector column needs a GIN index to be searchable, so the migration engine emits one for you (CREATE INDEX "idx_<table>_<col>_gin" ON "<table>" USING GIN ("<col>")) for every TsVector field, on both CreateTable and a later AddColumn. You don't write #[umbral(index)] and you don't hand-author the GIN DDL; migrate does it. Populating the tsvector from your text columns (running to_tsvector(...) via a trigger or a generated column) is still yours to set up; umbral stores and queries the tsvector, it doesn't yet generate it from source text at write time.
What full-text search matches, and what it doesn't
Full-text search matches lexemes: whole words, normalised by the language config (stemming, case-folding, stop-word removal). It is not substring search and not fuzzy/reverse matching. Against a document "The best product ever made":
| Query | Matches? | Why |
|---|---|---|
matches_websearch("best product") | ✅ | both lexemes present; a space means AND |
matches_websearch("best") | ✅ | best is a lexeme (and matches Best, case-folded) |
matches_websearch("products") | ✅ | stemmed to the same root as product |
matches_websearch("tseb") | ❌ | tseb (best reversed) is not a word in the document |
matches_websearch("rodu") | ❌ | a substring inside product is not a lexeme |
matches("prod:*") | ✅ | :* is a prefix match, the supported "partial word" |
A "reverse" or arbitrary substring like tseb will never match best; no full-text engine does that. The closest thing to partial-word matching is the prefix form (prod:* via .matches, or a leading word in .matches_websearch). These rows are pinned by a live-Postgres test (fts_matches_lexemes_not_substrings_or_reverses).
Then why use it, if ?search= can match rodu and it can't?
Matching rodu inside product is rarely what you want. A substring scan (?search=, LIKE '%term%') is literal; a tsvector understands words, and the trade-off mostly runs the other way:
?search= / LIKE '%term%' (substring) | tsvector + .matches_websearch (full-text) | |
|---|---|---|
run finds | running ✅ but misses ran; false-matches brunch, prune | running, ran, runs ✅ (stemmed); not brunch |
cat finds | category, concatenate, education (false positives) | the word cat only |
| stop words | the best requires the literal the | the ignored; best is what counts |
| operators | one literal substring | best AND cheap, best OR great, -discontinued, "exact phrase" |
| ranking | none | ts_rank orders by relevance |
| speed at scale | '%term%' can't use an index → full table scan, O(n) | GIN index → fast on millions of rows |
So rodu failing is the point: full-text search indexes meaning (stemmed words), not character fragments. Reach for ?search=/LIKE for "find this literal fragment" on small tables (SKUs, codes, exact substrings); reach for a tsvector for word-aware, ranked, index-backed search that scales.
Reading a tsvector column: you get lexemes, not your text
A tsvector is a search index in column form: it stores the lexeme document, not your original text. to_tsvector('english', 'The Best Product ever made') stores:
'best':2 'ever':4 'made':5 'product':3Stemmed, lower-cased, stop-words dropped (The is gone), with word positions. The original phrasing, casing, and word order are not recoverable from it. So when you fetch a model, the TsVector field hands you that lexeme string, not "The Best Product ever made":
let doc = Doc::objects().first_pg(&pool).await?.unwrap();doc.search.as_str(); // "'best':2 'ever':4 'made':5 'product':3" ← not display textdoc.title; // "The Best Product ever made" ← keep + show THISThe takeaway: keep your real text in a normal TEXT column (title, body), which is what you display and return in API responses, and derive the tsvector from it for searching. The two columns play different roles. Mark the TsVector field #[serde(skip)] if you don't want the raw lexeme vector leaking into JSON responses.
Full-text search is not the REST ?search= parameter. They are two different mechanisms:
- REST
/api//?search=termis a case-insensitive substring match (UPPER(col) LIKE UPPER('%term%'), case-insensitive contains) across the resource's text columns.tsvectorcolumns are skipped by?search=. So?search=best productfinds rows whose text literally contains the substringbest product, and?search=tsebfinds rows literally containingtseb; forgiving, but it doesn't stem, rank, or use the GIN index. TsVector+.matches()/.matches_websearch()is the ORM-level, index-backed, lexeme-based search shown above. There is currently no automatic?search=→tsvectorwiring; to expose full-text search over HTTP today you add a custom resource action or a dedicated query parameter that calls.matches_websearch(...). (Wiring?search=to prefer a model'stsvectorcolumn when one exists is a tracked enhancement.)
Rule of thumb: reach for ?search= for quick "contains this text" filtering on small/medium tables; reach for a TsVector column when you want word-aware, ranked, index-backed search at scale.
The table name
By default the table is snake_case of the struct name: BlogPost becomes blog_post. Override with the struct-level attribute when you need a different name:
#[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]#[umbral(table = "auth_user")]pub struct User { pub id: i64, pub username: String,}Beyond table, the struct-level #[umbral(...)] also accepts plugin, display, icon, database, singleton, soft_delete, unique_together = [[...]], indexes = [[...]], and ordering = ["-col", "col"]. The field-level #[umbral(...)] accepts primary_key, unique, index, default = "...", on_delete = "..." / on_update = "...", auto_now / auto_now_add, max_length = N, min = N / max = N, string, noform, noedit, help = "...", example = "...", widget = "...", backend = "...", cidr, db_constraint = false, slug_from = "...", choices, reverse_fk = "...", and no_reverse. (choices pairs #[derive(Choices)] on the enum with #[umbral(choices)] on the field; see Column types - Using a Rust enum.)
Display hints (string, max_length)
Two field-level attributes shape how the admin renders a model when no list_display is configured:
#[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]pub struct Article { pub id: i64, #[umbral(string, max_length = 60)] pub title: String, pub body: String, pub published_at: Option<chrono::DateTime<chrono::Utc>>,}#[umbral(string)]: the model's human-readable label field (display field), opt-in. Tagging one field per model switches the admin's defaultlist_displayto that single column when no explicit config is registered. Without any field tagged, the changelist shows every column. The PK isn't rendered as a separate column because every row in the admin is clickable into the sheet, so the id is already reachable. The attribute is never required; it's how you ask for the compact single-column form. Equivalent shapes:#[umbral(string)]and#[umbral(string = true)].#[umbral(max_length = N)]: soft truncation cap for display. The admin trims the value atNcharacters when rendering it inlist_display, appending…so the cut is visible. Counted in characters, not bytes, so emoji and combining characters survive intact.
Both attributes are display-only; they do not constrain the column at the SQL level. A String field with max_length = 60 is still TEXT (unbounded) at the database; the cap only affects how the admin renders cells. Use #[umbral(table = "...")] plus a manual CHECK constraint via a custom migration if you want the limit enforced at the DB.
Plugin-namespaced tables
When two plugins each ship a model with the same struct name (for example, a blog plugin and a shop plugin both defining Post), the SQL table names collide. Add plugin = "..." to opt into a <plugin>_<table> prefix:
#[derive(Debug, Clone, sqlx::FromRow, umbral::orm::Model)]#[umbral(plugin = "blog")]pub struct Post { pub id: i64, pub title: String, pub body: String,}// Model::TABLE == "blog_post"The prefix rule:
| Attributes set | Result |
|---|---|
neither plugin nor table | bare snake_case (post) |
plugin = "blog" only | "blog_post" |
table = "custom" only | "custom" (explicit always wins) |
plugin = "blog" + table = "custom" | "custom" (explicit always wins) |
When to reach for plugin = "...": multi-plugin apps where two or more plugins ship a model with the same struct name, or any time you want the table name to advertise which plugin owns it. The trade-off is longer table names in the DB.
Built-in plugins keep their bare names. umbral-auth, umbral-sessions, umbral-admin, and umbral-tasks do NOT use this attribute. Their table names (auth_user, session, etc.) are stable DB identifiers; switching them would break every existing database. If you are writing a third-party plugin, you should opt into plugin = "..." for every model the plugin ships to avoid future collisions with other plugins.
Model::NAME (the Rust struct name) is not affected by the plugin attribute; it stays "Post" regardless of which table name is used. The migration engine uses Model::NAME as the stable identity key when detecting table renames (see Managed migrations).
Reading rows
Post::objects() returns a Manager<Post> that builds a QuerySet. No pool argument; the manager reaches the ambient pool published by App::build().
let recent = Post::objects() .filter(post::PUBLISHED_AT.is_not_null()) .order_by(post::PUBLISHED_AT.desc()) .limit(20) .fetch() .await?;Each field gets a typed column constant under the snake_case module, named in SCREAMING_SNAKE_CASE: post::ID, post::TITLE, post::BODY, post::PUBLISHED_AT. The comparison methods are checked at compile time.
Terminals: .fetch(), .first(), .get(), .count(), .exists()
| Terminal | Returns | When to reach for it |
|---|---|---|
.fetch() | Result<Vec<T>, sqlx::Error> | The collection: list views, anything that maps over rows. |
.first() | Result<Option<T>, sqlx::Error> | "Give me one if there is one." None is a normal case. |
.get() | Result<T, GetError> | Exactly one row. Three branches: Ok(row), Err(NotFound), Err(MultipleObjectsReturned). |
.count() | Result<i64, sqlx::Error> | A row count without materializing rows. |
.exists() | Result<bool, sqlx::Error> | A LIMIT 1 existence probe. |
.get() is the shape for "this filter should pin one row exactly." MultipleObjectsReturned is a data-integrity signal, not a control-flow case; it fires when filters that should hit a UNIQUE constraint stop being unique.
use umbral::orm::GetError; match Post::objects().filter(post::ID.eq(42)).get().await { Ok(p) => render(p), Err(GetError::NotFound) => not_found(), Err(GetError::MultipleObjectsReturned) => unreachable!("id is unique"), Err(GetError::Sqlx(e)) => internal_error(e),}Manager::get(predicate) is the one-liner sugar:
let user = User::objects().get(user::EMAIL.eq(addr)).await?;Writing rows
Four write primitives complement the read side. All return a Result<_, umbral::orm::WriteError> whose variants name the offending field on type mismatch / null-on-required-column / unknown-column errors.
create(instance): INSERT one row
let new_post = Post { id: 0, // 0 is the autoincrement sentinel title: "Hello".into(), body: "World".into(), published_at: None,}; let inserted = Post::objects().create(new_post).await?;assert!(inserted.id > 0); // autoincrement PK populatedThe integer 0, the nil UUID, and the empty String are all treated as autoincrement-PK sentinels; the column is omitted from the INSERT so the database assigns one. An explicit non-default PK (id = 999) is bound as-is, so callers with UUID PKs or slug PKs can set them themselves.
The returned struct comes from RETURNING *, so timestamps that default to NOW() server-side land populated.
bulk_create([...]): INSERT many in one statement
let rows = (1..=100) .map(|i| Post { id: 0, title: format!("post {i}"), body: "...".into(), published_at: None }) .collect(); let count = Post::objects().bulk_create(rows).await?;assert_eq!(count, 100);One INSERT with N value tuples, far faster than calling .create() in a loop. Returns the affected-rows count rather than the populated rows; if you need the inserted IDs, follow up with a .filter(...).fetch().
.delete(): DELETE matching rows
let removed = Post::objects() .filter(post::PUBLISHED_AT.lt(cutoff)) .delete() .await?;Applies the accumulated filter predicates as the WHERE clause and returns the deleted-rows count. Without .filter(...), deletes every row. The type system can't tell "I forgot the filter" from "I really meant to truncate."
.update_values(map): UPDATE matching rows
use serde_json::{json, Map}; let mut updates = Map::new();updates.insert("title".into(), json!("Rebranded"));updates.insert("body".into(), json!("New copy.")); let touched = Post::objects() .filter(post::ID.eq(42)) .update_values(updates) .await?;Takes a serde_json::Map<String, Value> of column → new value. Each value is converted to the right sea-query type per the column's SqlType. PATCH semantics: columns absent from the map keep their current values. The PK column is silently skipped if present in the map (rewriting the PK while filtering on the old one is a footgun).
get_or_create(predicate, defaults): INSERT only when missing
Fetch the first row matching predicate; if none exists, insert defaults and return it. The second tuple element says whether a write happened so the caller can branch on first-time-ever vs already-seeded.
let (ct, created) = ContentType::objects() .get_or_create( content_type::APP_LABEL.eq("blog") & content_type::MODEL.eq("post"), ContentType { id: 0, app_label: "blog".into(), model: "post".into(), }, ) .await?;if created { tracing::info!("first boot, seeded blog.post content type");}Two round-trips on the miss path (filter+first, then create), one on the hit path. Pair with a UNIQUE constraint on the predicate columns for at-most-one semantics under concurrent inserters; the constraint is the backstop a concurrent caller hits if both pass the filter check before either creates.
Common shapes:
- Seeding on first boot. A plugin's
on_readywalks a list of canonical rows andget_or_creates each. Idempotent regardless of how many times the app restarts. - Find-or-make. A handler that needs a tag, category, or content-type row to attach to a write:
Tag::objects().get_or_create(tag::SLUG.eq(slug), Tag { ... }).
upsert(row): INSERT or UPDATE keyed on the primary key
INSERT-or-UPDATE for cases where you have a fully-formed row and want it to "win" against any existing row with the same PK. Sea-query emits INSERT ... ON CONFLICT(<pk>) DO UPDATE SET col = excluded.col, ...; the SQLite and Postgres syntax happens to match.
let cached = CacheEntry::objects() .upsert(CacheEntry { key: "homepage:html".into(), value: rendered_html.into_bytes(), expires_at: Some(Utc::now() + Duration::seconds(60)), }) .await?;Returns the row as the DB stored it post-upsert. Every non-PK column in the supplied instance overwrites the existing one on a conflict; columns omitted from the serialised instance (rare with #[derive(Model)] since every field is serialised) keep their current value via sea-query's DO NOTHING shape.
Compared to get_or_create, upsert is the right call when you want to OVERWRITE on a conflict rather than preserve the existing row. get_or_create is read-then-create; upsert is "this is the canonical row now".
What's next
- Register the model with the migration engine so makemigrations tracks it.
- Relationships (ForeignKey): declare
ForeignKey<T>fields, query by FK, and eager-load withselect_related. - Transactions: wrap multiple writes in
umbral::transaction(...)to commit or roll back atomically. - Querying: F-expressions, Q-objects, exclude, projections, sugar (earliest/latest/distinct/explain), mutate-side terminals, subqueries + set ops, JSON column ops.
- Read the spec at
docs/specs/04-orm-model-and-fields.mdfor the full field-type catalogue and the QuerySet surface.