PostgreSQL for Java Developers — The Features You Should Be Using
by Eric Hanson, Backend Developer at Clean Systems Consulting
JSONB — flexible schema without a document database
PostgreSQL's JSONB type stores JSON in a decomposed binary format that supports indexing and fast field access. It's not a replacement for relational modeling — it's the right tool when part of your data is genuinely schema-less: user-defined attributes, event payloads, configuration, metadata that varies per record.
ALTER TABLE products ADD COLUMN attributes JSONB;
-- Store arbitrary attributes per product
UPDATE products SET attributes = '{
"color": "red",
"weight_kg": 1.5,
"dimensions": {"width": 10, "height": 20},
"tags": ["sale", "featured"]
}' WHERE id = 1;
-- Query inside JSONB
SELECT id, name FROM products
WHERE attributes->>'color' = 'red'
AND (attributes->>'weight_kg')::numeric < 2.0;
-- GIN index makes JSONB queries fast
CREATE INDEX idx_products_attributes ON products USING GIN(attributes);
-- Check containment — does the JSON contain this subset?
SELECT * FROM products
WHERE attributes @> '{"tags": ["sale"]}';
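The semantics of @> can be pictured in plain Java — a toy sketch of jsonb containment over parsed maps and lists, illustrative only, not how PostgreSQL implements it (and it skips edge cases like an array containing a bare scalar):

```java
import java.util.List;
import java.util.Map;

public class JsonbContainment {

    // Mirrors jsonb @>: every key/value in the needle must appear in the haystack.
    // Arrays match if each needle element is contained in some haystack element.
    public static boolean contains(Object haystack, Object needle) {
        if (haystack instanceof Map<?, ?> h && needle instanceof Map<?, ?> n) {
            return n.entrySet().stream().allMatch(e ->
                h.containsKey(e.getKey()) && contains(h.get(e.getKey()), e.getValue()));
        }
        if (haystack instanceof List<?> h && needle instanceof List<?> n) {
            return n.stream().allMatch(ne -> h.stream().anyMatch(he -> contains(he, ne)));
        }
        return haystack != null && haystack.equals(needle);
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = Map.of(
            "color", "red",
            "tags", List.of("sale", "featured"));
        System.out.println(contains(attributes, Map.of("tags", List.of("sale")))); // true
        System.out.println(contains(attributes, Map.of("color", "blue")));         // false
    }
}
```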
Using JSONB from Spring Boot:
@Entity
public class Product {

    @Id
    private Long id;

    private String name;

    @JdbcTypeCode(SqlTypes.JSON)
    @Column(columnDefinition = "jsonb")
    private Map<String, Object> attributes; // Hibernate 6 maps this to JSONB
}
Or with JdbcTemplate for native queries:
@Repository
public class ProductRepository {

    private final JdbcTemplate jdbcTemplate;
    private final ObjectMapper objectMapper;
    private final RowMapper<Product> productRowMapper;

    public ProductRepository(JdbcTemplate jdbcTemplate, ObjectMapper objectMapper,
                             RowMapper<Product> productRowMapper) {
        this.jdbcTemplate = jdbcTemplate;
        this.objectMapper = objectMapper;
        this.productRowMapper = productRowMapper;
    }

    public List<Product> findByAttribute(String key, String value) {
        // ->> extracts the value as text, so it compares directly to the String parameter
        return jdbcTemplate.query(
            "SELECT * FROM products WHERE attributes->>? = ?",
            productRowMapper,
            key, value);
    }

    public List<Product> findContaining(Map<String, Object> criteria) {
        return jdbcTemplate.query(
            "SELECT * FROM products WHERE attributes @> ?::jsonb",
            productRowMapper,
            toJson(criteria));
    }

    public void updateAttribute(Long id, String key, Object value) {
        jdbcTemplate.update(
            "UPDATE products SET attributes = attributes || jsonb_build_object(?, ?::jsonb) WHERE id = ?",
            key, toJson(value), id);
    }

    // writeValueAsString throws a checked JsonProcessingException; wrap it once here
    private String toJson(Object value) {
        try {
            return objectMapper.writeValueAsString(value);
        } catch (JsonProcessingException e) {
            throw new IllegalArgumentException("Could not serialize value to JSON", e);
        }
    }
}
attributes || jsonb_build_object(key, value) merges the new key-value into the existing JSONB without replacing the whole document — the PostgreSQL equivalent of a partial update.
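One caveat worth knowing: || is a shallow merge — top-level keys from the right side replace the left side's values wholesale, so a nested object is overwritten, not merged into. The behavior, sketched in plain Java over maps (illustrative, not PostgreSQL's implementation):

```java
import java.util.HashMap;
import java.util.Map;

public class JsonbMerge {

    // Mirrors jsonb ||: top-level keys from right overwrite left wholesale,
    // including nested objects — there is no recursive deep merge.
    public static Map<String, Object> concat(Map<String, Object> left,
                                             Map<String, Object> right) {
        Map<String, Object> result = new HashMap<>(left);
        result.putAll(right);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = Map.of(
            "color", "red",
            "dimensions", Map.of("width", 10, "height", 20));
        Map<String, Object> patch = Map.of("dimensions", Map.of("width", 12));

        // "height" is gone: the nested object was replaced, not merged
        System.out.println(concat(existing, patch));
    }
}
```

To update a single nested field without losing its siblings, use jsonb_set with a path — e.g. jsonb_set(attributes, '{dimensions,width}', '12') — instead of ||.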
When to use JSONB: data that genuinely varies per record (user-defined fields, event metadata, configuration), temporary storage during schema evolution, and attributes you query occasionally without needing strong schema enforcement. Don't use JSONB for data that's always structured and queried frequently — a proper column with an index outperforms JSONB there.
Full-text search
PostgreSQL's full-text search handles stemming, stop words, ranking, and language awareness — the capabilities you'd otherwise reach for Elasticsearch to provide.
-- Add a tsvector column for full-text indexing
ALTER TABLE products ADD COLUMN search_vector tsvector;
-- Populate from multiple columns with different weights
UPDATE products SET search_vector =
setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(description, '')), 'B') ||
setweight(to_tsvector('english', coalesce(array_to_string(tags, ' '), '')), 'C');
-- GIN index on the tsvector
CREATE INDEX idx_products_search ON products USING GIN(search_vector);
-- Search with ranking
SELECT id, name,
ts_rank(search_vector, query) as rank
FROM products,
plainto_tsquery('english', 'wireless keyboard') as query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
setweight assigns relevance weights: A (highest) for name matches, B for description, C for tags. ts_rank scores results by how well they match — results where the query matches the title rank higher than description-only matches.
Keeping the tsvector current — a trigger:
CREATE OR REPLACE FUNCTION update_product_search_vector()
RETURNS TRIGGER AS $$
BEGIN
NEW.search_vector :=
setweight(to_tsvector('english', coalesce(NEW.name, '')), 'A') ||
setweight(to_tsvector('english', coalesce(NEW.description, '')), 'B');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER products_search_vector_update
BEFORE INSERT OR UPDATE OF name, description ON products
FOR EACH ROW EXECUTE FUNCTION update_product_search_vector();
The trigger fires only when name or description changes — BEFORE INSERT OR UPDATE OF name, description. Other column updates don't regenerate the search vector.
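On PostgreSQL 12 and later, a stored generated column can replace the trigger approach entirely — declare the column generated when adding it and PostgreSQL keeps it current itself (the expression must be immutable, which to_tsvector with an explicit language configuration is):

```sql
-- Requires PostgreSQL 12+; the column is maintained automatically on every write
ALTER TABLE products
  ADD COLUMN search_vector tsvector
  GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(name, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(description, '')), 'B')
  ) STORED;
```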
From Spring Boot:
@Query(value = """
SELECT p.*, ts_rank(p.search_vector, query) as rank
FROM products p,
plainto_tsquery('english', :searchTerm) as query
WHERE p.search_vector @@ query
ORDER BY rank DESC
LIMIT :limit
""", nativeQuery = true)
List<Product> fullTextSearch(@Param("searchTerm") String searchTerm,
@Param("limit") int limit);
For autocomplete and prefix matching, use to_tsquery with the :* prefix operator:
-- "wire:*" matches "wireless", "wired", etc.
SELECT name FROM products
WHERE search_vector @@ to_tsquery('english', 'wire:*')
LIMIT 10;
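Raw user input needs cleanup before it reaches to_tsquery — characters like & | ! : ( ) are operators in tsquery syntax and malformed input raises an error. A hypothetical helper (the name and escaping rules are this article's, not a library API) that turns free text into a prefix query string:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class TsQueryBuilder {

    // Builds a prefix tsquery like "wireless:* & keyb:*" from raw user input.
    // Strips tsquery operator characters, then ANDs the remaining terms.
    public static String prefixQuery(String input) {
        return Arrays.stream(input.trim().split("\\s+"))
            .map(term -> term.replaceAll("[^\\p{L}\\p{N}]", ""))
            .filter(term -> !term.isEmpty())
            .map(term -> term + ":*")
            .collect(Collectors.joining(" & "));
    }

    public static void main(String[] args) {
        System.out.println(prefixQuery("wireless keyb"));  // wireless:* & keyb:*
        System.out.println(prefixQuery("c&c (beta)"));     // cc:* & beta:*
    }
}
```

The result is then bound as an ordinary parameter: to_tsquery('english', ?).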
Window functions — analytics without loading data into Java
Window functions compute values across a set of rows related to the current row, without collapsing the result set like GROUP BY does.
-- Running total per customer
SELECT order_id, customer_id, amount,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY created_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as running_total
FROM orders;
-- Rank orders by amount within each customer, without collapsing
SELECT order_id, customer_id, amount,
RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) as rank
FROM orders;
-- Previous and next order amount for comparison
SELECT order_id, amount,
LAG(amount) OVER (PARTITION BY customer_id ORDER BY created_at) as prev_amount,
LEAD(amount) OVER (PARTITION BY customer_id ORDER BY created_at) as next_amount
FROM orders;
From Spring Boot — using Tuple projections:
@Query(value = """
SELECT o.id,
o.customer_id,
o.amount,
RANK() OVER (PARTITION BY o.customer_id ORDER BY o.amount DESC) as spend_rank,
SUM(o.amount) OVER (PARTITION BY o.customer_id) as customer_total
FROM orders o
WHERE o.created_at > :since
""", nativeQuery = true)
List<Object[]> findOrdersWithRanking(@Param("since") Instant since);
Or map to an interface projection:
public interface OrderRankingView {
Long getId();
Long getCustomerId();
BigDecimal getAmount();
Long getSpendRank();
BigDecimal getCustomerTotal();
}
@Query(value = """
SELECT o.id,
o.customer_id,
o.amount,
RANK() OVER (PARTITION BY o.customer_id ORDER BY o.amount DESC) as spend_rank,
SUM(o.amount) OVER (PARTITION BY o.customer_id) as customer_total
FROM orders o WHERE o.created_at > :since
""", nativeQuery = true)
List<OrderRankingView> findOrderRankings(@Param("since") Instant since);
Window functions eliminate the pattern of loading records into Java, iterating, and computing rankings or running totals in application code. The database computes these results in a single scan — dramatically more efficient for large datasets.
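For contrast, here is the application-side version the running-total query replaces, with illustrative types standing in for the real entities — every row has to travel over the wire before this loop can run:

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RunningTotals {

    record Order(long customerId, BigDecimal amount) {}

    // What SUM(amount) OVER (PARTITION BY customer_id ORDER BY created_at) does,
    // done by hand: orders must already be sorted by created_at within customer.
    public static List<BigDecimal> runningTotals(List<Order> orders) {
        Map<Long, BigDecimal> perCustomer = new HashMap<>();
        List<BigDecimal> totals = new ArrayList<>();
        for (Order o : orders) {
            BigDecimal sum = perCustomer
                .getOrDefault(o.customerId(), BigDecimal.ZERO)
                .add(o.amount());
            perCustomer.put(o.customerId(), sum);
            totals.add(sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order(1, new BigDecimal("10.00")),
            new Order(1, new BigDecimal("5.00")),
            new Order(2, new BigDecimal("7.50")));
        System.out.println(runningTotals(orders)); // [10.00, 15.00, 7.50]
    }
}
```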
Advisory locks — distributed coordination
PostgreSQL advisory locks provide distributed mutual exclusion using the database as a coordination point. No Redis, no Zookeeper required.
-- Session-level lock — held until explicitly released or session ends
SELECT pg_try_advisory_lock(12345); -- returns true if lock acquired
-- Transaction-level lock — automatically released at transaction end
SELECT pg_try_advisory_xact_lock(hashtext('process-daily-report'));
From Spring Boot — preventing concurrent job execution:
@Service
public class DailyReportService {

    private static final Logger log = LoggerFactory.getLogger(DailyReportService.class);

    private final JdbcTemplate jdbcTemplate;

    public DailyReportService(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Scheduled(cron = "0 0 2 * * *") // 2 AM daily
    @Transactional
    public void generateDailyReport() {
        long lockKey = "daily-report".hashCode();
        // Try to acquire the lock — returns false if another instance holds it
        Boolean lockAcquired = jdbcTemplate.queryForObject(
            "SELECT pg_try_advisory_xact_lock(?)",
            Boolean.class, lockKey);
        if (!Boolean.TRUE.equals(lockAcquired)) {
            log.info("Daily report generation already running on another instance, skipping");
            return;
        }
        // Lock acquired — only this instance runs the report
        generateReport();
        // Lock released automatically when the transaction ends
    }
}
pg_try_advisory_xact_lock is transaction-scoped — it releases when the transaction commits or rolls back. No risk of leaving locks held if the application crashes — PostgreSQL cleans up automatically.
Session-level advisory locks suit long-running work that spans multiple transactions, but they have a catch with connection pools: the lock belongs to the database session, i.e. the physical connection. Two JdbcTemplate calls can run on different pooled connections, so the unlock may never reach the session holding the lock. Acquire and release on one dedicated connection held for the lock's lifetime:

public boolean acquireDistributedLock(Connection conn, String lockName) throws SQLException {
    try (PreparedStatement ps = conn.prepareStatement("SELECT pg_try_advisory_lock(?)")) {
        ps.setLong(1, lockName.hashCode());
        try (ResultSet rs = ps.executeQuery()) {
            return rs.next() && rs.getBoolean(1);
        }
    }
}

public void releaseDistributedLock(Connection conn, String lockName) throws SQLException {
    // Must run on the same connection that acquired the lock
    try (PreparedStatement ps = conn.prepareStatement("SELECT pg_advisory_unlock(?)")) {
        ps.setLong(1, lockName.hashCode());
        ps.executeQuery();
    }
}
Use session-level locks when the lock must survive multiple transactions. Use transaction-level locks when the lock should automatically release with the transaction.
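A note on lock keys: String.hashCode() works, but it's only 32 bits (more collision-prone than the 64-bit key space allows) and differs from what hashtext produces in SQL, so keys derived in Java and in SQL won't line up. A hypothetical helper (this article's convention, not a library API) deriving a stable 64-bit key from a lock name:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class AdvisoryLockKeys {

    // Stable across JVMs and restarts: first 8 bytes of SHA-256 of the lock name.
    // Every service taking the lock must derive keys the same way.
    public static long keyFor(String lockName) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(lockName.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e); // every JDK must provide it
        }
    }

    public static void main(String[] args) {
        System.out.println(keyFor("daily-report"));
        System.out.println(keyFor("daily-report") == keyFor("daily-report")); // true
    }
}
```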
LISTEN/NOTIFY — real-time events without polling
LISTEN/NOTIFY is PostgreSQL's pub/sub mechanism. One connection publishes events; other connections listening on the channel receive them immediately.
-- Publisher (can be a trigger or application code)
NOTIFY order_events, '{"orderId": 123, "status": "SHIPPED"}';
-- Listener
LISTEN order_events;
-- Receives notifications immediately when NOTIFY fires
Trigger-based notification on data change:
CREATE OR REPLACE FUNCTION notify_order_status_change()
RETURNS TRIGGER AS $$
BEGIN
PERFORM pg_notify('order_events',
json_build_object(
'orderId', NEW.id,
'status', NEW.status,
'updatedAt', NEW.updated_at
)::text
);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER order_status_change_notify
AFTER UPDATE OF status ON orders
FOR EACH ROW EXECUTE FUNCTION notify_order_status_change();
From Spring Boot — listening with PgConnection:
@Component
public class OrderEventListener implements InitializingBean, DisposableBean {
private final DataSource dataSource;
private Connection listenConnection;
private ScheduledExecutorService executor;
@Override
public void afterPropertiesSet() throws Exception {
// Dedicated connection for LISTEN — not from HikariCP pool
listenConnection = dataSource.getConnection().unwrap(PGConnection.class)
.getConnection();
listenConnection.setAutoCommit(true);
try (Statement stmt = listenConnection.createStatement()) {
stmt.execute("LISTEN order_events");
}
executor = Executors.newSingleThreadScheduledExecutor();
executor.scheduleWithFixedDelay(this::pollNotifications,
0, 100, TimeUnit.MILLISECONDS);
}
private void pollNotifications() {
try {
PGConnection pgConn = listenConnection.unwrap(PGConnection.class);
PGNotification[] notifications = pgConn.getNotifications(0);
if (notifications != null) {
for (PGNotification notification : notifications) {
handleNotification(notification.getParameter());
}
}
} catch (SQLException e) {
log.error("Error polling notifications", e);
}
}
private void handleNotification(String payload) {
try {
OrderStatusChange event = objectMapper.readValue(payload,
OrderStatusChange.class);
applicationEventPublisher.publishEvent(event);
} catch (Exception e) {
log.error("Error processing notification payload: {}", payload, e);
}
}
@Override
public void destroy() throws Exception {
executor.shutdown();
listenConnection.close();
}
}
LISTEN/NOTIFY eliminates polling loops. Instead of SELECT * FROM orders WHERE updated_at > :lastCheck every 5 seconds, the application receives notifications within milliseconds of the data change. For low-latency event propagation within a single PostgreSQL deployment, this is more efficient than introducing a message broker.
Table partitioning — managing large tables
Partitioning splits a large table into smaller physical tables while presenting a single logical table. Queries that filter on the partition key only scan relevant partitions — dramatically faster for time-series data.
-- Range partitioning by month
CREATE TABLE orders (
id BIGSERIAL,
customer_id BIGINT NOT NULL,
created_at TIMESTAMPTZ NOT NULL,
total NUMERIC(10,2) NOT NULL
) PARTITION BY RANGE (created_at);
-- Monthly partitions
CREATE TABLE orders_2026_01 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE orders_2026_02 PARTITION OF orders
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
-- Indexes created on the parent apply to all partitions
CREATE INDEX ON orders(customer_id);
CREATE INDEX ON orders(created_at);
With range partitioning on created_at, a query WHERE created_at > '2026-01-01' AND created_at < '2026-02-01' only scans the January partition — not the entire table. This is partition pruning.
Automatic partition creation — create future partitions before the month begins:
@Scheduled(cron = "0 0 1 28 * *") // 1 AM on the 28th of each month
@Transactional
public void createNextMonthPartition() {
    YearMonth nextMonth = YearMonth.now().plusMonths(1);
    String partitionName = "orders_" +
        nextMonth.format(DateTimeFormatter.ofPattern("yyyy_MM"));
    String startDate = nextMonth.atDay(1).toString();
    String endDate = nextMonth.plusMonths(1).atDay(1).toString();
    // String.format is safe here: every value is derived from YearMonth, never
    // from user input — and DDL cannot take bind parameters anyway
    jdbcTemplate.execute(String.format(
        "CREATE TABLE IF NOT EXISTS %s PARTITION OF orders " +
        "FOR VALUES FROM ('%s') TO ('%s')",
        partitionName, startDate, endDate));
    log.info("Created partition: {}", partitionName);
}
Partitioning is transparent to Hibernate and Spring Data JPA — queries and inserts against the parent orders table work without modification. The benefit is entirely in query performance and maintenance (dropping old partitions is instant — no DELETE required).
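The date arithmetic in the scheduled job is worth isolating so it can be unit-tested without a database — a small sketch (the names are this article's, not a library API):

```java
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

public class PartitionNaming {

    record PartitionSpec(String name, String from, String to) {}

    // Computes the partition name and [from, to) bounds for a given month.
    public static PartitionSpec forMonth(YearMonth month) {
        String name = "orders_" + month.format(DateTimeFormatter.ofPattern("yyyy_MM"));
        return new PartitionSpec(
            name,
            month.atDay(1).toString(),
            month.plusMonths(1).atDay(1).toString());
    }

    public static void main(String[] args) {
        System.out.println(forMonth(YearMonth.of(2026, 3)));
        // PartitionSpec[name=orders_2026_03, from=2026-03-01, to=2026-04-01]
    }
}
```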
The PostgreSQL capabilities worth knowing
Each of these features eliminates application code that would otherwise be written in Java:
- JSONB eliminates a separate document database for flexible-schema data
- Full-text search eliminates Elasticsearch for moderate search requirements
- Window functions eliminate in-memory sorting and ranking in Java
- Advisory locks eliminate Redis or Zookeeper for distributed coordination
- LISTEN/NOTIFY eliminates polling loops for real-time event propagation
- Partitioning eliminates manual table management for large time-series datasets
The instinct to reach for additional infrastructure — a search cluster, a cache for coordination, a message broker for in-process events — is often a symptom of underutilizing the database already in the stack. PostgreSQL is capable of more than most Java applications ask of it.