DZone
Frameworks

A framework is a collection of code that is leveraged in the development process by providing ready-made components. Through the use of frameworks, architectural patterns and structures are created, which help speed up the development process. This Zone contains helpful resources for developers to learn about and further explore popular frameworks such as the Spring framework, Drupal, Angular, Eclipse, and more.

Latest Premium Content
Trend Report: Low-Code Development
Refcard #288: Getting Started With Low-Code Development
Refcard #348: E-Commerce Development Essentials

DZone's Featured Frameworks Resources

Liquid Glass, Material 3, and a Lot of Plumbing


By Shai Almog, DZone Core
It has been one of those weeks where the diff is bigger than the headline. The headline is short — Codename One now ships modern native themes: an iOS "liquid glass" look and an Android Material 3 look, bundled into the iOS and Android ports, on by default in the Playground, and selectable from a brand new menu in the simulator. The diff behind that headline is several thousand lines across the platform ports, the simulator, the GUI plumbing, and a small army of screenshot tests. What is Codename One? Codename One is an open-source framework for building native iOS, Android, desktop, and web apps from a single Java or Kotlin codebase. Learn more at codenameone.com. The theme behind the work is simple: Codename One should look modern out of the box on every platform we ship to, and it should feel fast. Almost everything in the past week of commits is in service of one of those two goals. Try It Right Now in the Playground The easiest way to see any of this is the Playground. The Playground now defaults to iOS Modern when the device toggle is set to iPhone and Android Material 3 when it is set to Android, in both light and dark mode. No setup, no pom.xml, no build hints — just open the page, drop in any of the standard components, and the modern look is what you get. If the past releases of Codename One looked dated to you, the Playground is where to start. The simulator is the second-easiest place. We will get to that. The New Native Themes For most of Codename One's life, the iOS native theme has been the venerable iOS 7 flat theme, and the Android native theme has been Holo Light. Both still ship — backward compatibility has always been one of our most important goals — but they are no longer where we want a brand new app to start. 
We spent the bulk of this week building two new themes that target current platform aesthetics:

  • iOS Modern: Apple system colors (accent #007aff light / #0a84ff dark, grouped-form surfaces, the system separator palette), pill borders for tabs, an iOS-Settings-style MultiButton, CHECK_CIRCLE-style checkbox glyphs, and translucent surfaces for Dialog and TabsContainer so they read as glass-frosted on top of whatever is behind them. It is not a real UIVisualEffectView backdrop (that is a port-side primitive we have not built yet), but the look is much closer to the iOS 26 vibe than anything we have shipped before.
  • Android Material 3: the Material 3 baseline tonal palette (primary #6750a4 light / #d0bcff dark, surface-container tiers, elevated containers approximated tonally because real elevation drop-shadows are still on the to-do list), plus all the Material density and padding choices: Roboto-ish proportions, a top-tab bar with the underline-by-color treatment, and the standard square checkbox glyph.

Each theme covers the usual ~25 UIIDs: base (Component, Form, ContentPane, Container), typography (Label, SecondaryLabel, TertiaryLabel, SpanLabel*), buttons (Button, RaisedButton, FlatButton with .pressed and .disabled), text input, selection controls, toolbar, tabs, side menu, list, MultiButton, dialog/sheet, FAB, and all the supporting separator and popup pieces. Both themes have full light and dark coverage. The shipping CSS sources sit in the repo at native-themes/ios-modern/theme.css and native-themes/android-material/theme.css for anyone who wants to read what each UIID is doing.

iOS Modern

This is the ShowcaseTheme capture from the new screenshot suite, run on iOS in light and dark. Same Form, same components, swap Display.setDarkMode(...) and re-resolve.
The form is built like this:

Java
    // Two buttons side by side: the stock UIID and the RaisedButton UIID
    Container row = new Container(BoxLayout.x());
    row.add(new Button("Default"));
    Button raised = new Button("Raised");
    raised.setUIID("RaisedButton");
    row.add(raised);
    form.add(row);

    // Single-line text input
    TextField tf = new TextField("[email protected]");
    form.add(tf);

    // Selection controls, both pre-selected so the checked glyphs are visible
    Container toggles = new Container(BoxLayout.x());
    CheckBox cb = new CheckBox("Remember me");
    cb.setSelected(true);
    toggles.add(cb);
    RadioButton rb = new RadioButton("Agree");
    rb.setSelected(true);
    toggles.add(rb);
    form.add(toggles);

    SpanLabel body = new SpanLabel("Body copy …");

That gives you the full picture on one screen:

  • The Default button uses the stock Button UIID. The Raised button uses RaisedButton, which cn1-derives from Button and adds a tinted pill on top of the iOS system blue, which is the iOS Modern accent in both modes.
  • The TextField is a single rounded-rect surface with the iOS system gray fill, the same shape Apple uses in Settings.
  • CheckBox and RadioButton use the new optional @checkBoxCheckedIconInt / @radioCheckedIconInt theme constants to swap to CHECK_CIRCLE / CHECK_CIRCLE_OUTLINE glyphs: a Reminders-app aesthetic on iOS, while Android keeps the standard square check.
  • The SpanLabel body uses the theme's base font and inherits transparent backgrounds, so it never paints over a translucent parent.

The full-screen source is DarkLightShowcaseThemeScreenshotTest.java.

Android Material 3

Same ShowcaseTheme source on Android. The Material 3 baseline palette gives Default the primary container color and Raised the elevated-surface tone, with the dark variant flipping the relationship correctly via the dark color-role mapping. Padding and font sizing follow Material density, which you can see in how compact the same Form lays out compared to iOS.

Translucent Surfaces

This is the DialogTheme capture against the screenshot suite's textured diagonal-stripe backdrop. The backdrop is intentional: it lets reviewers see whether anything that is supposed to be translucent actually is.
The iOS Modern Dialog uses an rgba surface fill (0.78 alpha in light, 0.95 in dark; dark needs more opacity because bright stripes bleed through) and its DialogBody, DialogTitle, ContentPane, and CommandArea sub-UIIDs are transparent, so the rounded corners read cleanly. The same trick is applied to TabsContainer and the iOS MultiButton.

Runtime Palette Overrides

The native theme is meant to be a starting point: you can layer your own palette on top without forking the theme. Above is the PaletteOverrideTheme capture: the base is iOS Modern, but the test layers a magenta palette on top at runtime via UIManager.addThemeProps(...). RaisedButton, FlatButton, the disabled tone, and the body-copy span all pick up the override in both light and dark. The override seam works at the resource-bundle layer, exactly the same mechanism a user theme uses to override the native theme on a real app.

In the Simulator

Three pieces, all live:

  • Themes are bundled. The simulator jar-with-dependencies includes both modern themes alongside the four legacy themes (iPhoneTheme, iOS7Theme, androidTheme, android_holo_light) at the root of the jar. The simulator can pick any one of them at runtime without touching the skin repo.
  • A new "Native Theme" menu. Right next to the Skins menu, there is now a Native Theme menu with a radio group for the six themes, plus "Auto" and "Use skin's embedded theme". Selecting one writes the simulatorNativeTheme Preference, flips the simulator-reload flag, and disposes the current window so the skin reloader kicks in with the new theme. You can sit on a single skin and flip through every native theme in seconds.
  • Build hints know about it. The new nativeTheme, ios.themeMode, and and.themeMode build hints are registered with the simulator's Build Hints UI on launch: labels, types, value lists, descriptions, the lot. (The legacy keys cn1.nativeTheme and cn1.androidTheme are still honored for back-compat.)
Set them in the Build Hints dialog, in codenameone_settings.properties, or via -D system properties; they flow through to the device build and the simulator, both. The "Auto" choice in the Native Theme menu defers to those build hints — set ios.themeMode=modern in your project's settings and "Auto" previews iOS Modern; flip the same project to ios.themeMode=ios7 and "Auto" previews iOS 7. The explicit menu entries (iOS Modern, iOS 7, etc.) override the hints regardless. -Dcn1.forceSimulatorTheme is still honored as the highest-priority override; pick "Use skin's embedded theme" to bypass the framework theme entirely and get whatever the skin shipped with. On Devices The opt-in is the same on iOS and Android. The platform knobs follow a single naming pattern — ios.themeMode and and.themeMode — and accept modern / liquid / auto / ios7 / flat on iOS, modern / material / auto / hololight / legacy on Android. There is a single cross-platform shortcut, nativeTheme=modern, which the iOS builder consults when ios.themeMode is unset and which the Android port reads at runtime as a default for and.themeMode. The legacy aliases cn1.androidTheme and cn1.nativeTheme are still honored for back-compat, as is and.hololight=true. The default for an existing app stays on legacy on every platform. We do not flip a 15-year-old app's look without an opt-in. New apps generated from the initializr ship with nativeTheme=modern, ios.themeMode=modern, and and.themeMode=modern already set in codenameone_settings.properties, so a brand new project starts with the modern themes preselected. The Playground does the same, and Playground project downloads carry the same defaults into the generated codenameone_settings.properties. The HTML5 port has the runtime support for the modern themes, but does not bundle them with user apps yet — that is one of the loose ends we want to close in the next round. 
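The lookup order described in this section can be summarized in a few lines. The sketch below is a language-neutral model written in Python, not framework code: resolve_theme_mode is a hypothetical helper, the accepted values are copied from the text above, and both the handling of unknown values (rejected here) and the legacy aliases (ignored here) are assumptions rather than documented behavior.

```python
# Hypothetical model of the build-hint lookup described above; not Codename One code.
ACCEPTED = {
    "ios": {"modern", "liquid", "auto", "ios7", "flat"},
    "and": {"modern", "material", "auto", "hololight", "legacy"},
}


def resolve_theme_mode(hints, platform):
    """Return the requested theme mode for "ios" or "and", or None for the legacy default."""
    if platform not in ACCEPTED:
        raise ValueError("unknown platform: " + platform)
    # The platform-specific hint wins; nativeTheme is the cross-platform fallback.
    value = hints.get(platform + ".themeMode") or hints.get("nativeTheme")
    if value is None:
        # No opt-in: an existing app keeps its legacy look.
        return None
    if value not in ACCEPTED[platform]:
        # Assumption: this sketch rejects unknown values; the real ports may differ.
        raise ValueError("unsupported value for " + platform + ": " + value)
    return value
```

With only nativeTheme=modern set, both platforms resolve to modern; adding ios.themeMode=ios7 changes the iOS result alone, and an empty hint set resolves to the legacy default.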
Sticky Headers

The other piece of look-and-feel that we want to highlight is StickyHeaderContainer, which finally has a proper home in the framework. It is the iOS-contacts-list / sectioned-material-list component: scroll past a section boundary, and the previous header is replaced by the next one. New this week, the swap is animated. A directional slide moves the outgoing header up on a forward scroll and down on a reverse scroll, or you can pick a cross-fade.

Above is a six-frame sweep from the screenshot test: the user scrolls through sections A, B, C, D, E, and the pinned header recolors to whichever section is currently active at the top of the viewport.

The API is small. Build the container, register sections with addSection(header, content), configure the transition style and duration, and add it to a Form:

Java
    StickyHeaderContainer sticky = new StickyHeaderContainer();
    sticky.setTransitionStyle(StickyHeaderContainer.TRANSITION_SLIDE);
    sticky.setTransitionDurationMillis(250);

    // One section per letter, each with a header and five entries
    for (char c = 'A'; c <= 'Z'; c++) {
        Label header = new Label("" + c, "StickyHeader");
        Container items = new Container(BoxLayout.y());
        for (int i = 0; i < 5; i++) {
            items.add(new Label(c + " entry " + i));
        }
        sticky.addSection(header, items);
    }

TRANSITION_SLIDE is the default. TRANSITION_FADE cross-fades the outgoing header on top of the incoming one. TRANSITION_NONE keeps the prior instantaneous swap if you want it. See issue #4807 for the original request.

How We Test This

Every screenshot in this post is captured by a test that runs the app on a real iOS device, an Android emulator, and headless Chrome, then diffs each capture against a stored golden image. The diff is the test: if the rendered pixels drift, the run fails. For animations, the test grabs a series of frames over a fixed-duration transition, then composites them into a single index image.
That is how the dual-appearance shots end up as one side-by-side picture per test: … and how the sticky-header animation ends up as a six-frame strip stitched into a GIF:

If you want to read the source, the suite lives at scripts/hellocodenameone/common/src/main/java/com/codenameone/examples/hellocodenameone/tests/.

Bugs and Misc Features From This Week

The theme work was the loudest thing this week, but plenty of other commits landed alongside it:

  • SIMD large-allocation fallback. The SIMD path on iOS allocates its working buffers on the stack via alloca for speed. Past a certain buffer size, the stack allocation simply fails: there is not enough stack to give, and the request crashes the process. The fix detects that case and falls back to a regular heap allocation when the request is too large to live on the stack. Small SIMD ops keep the fast alloca path; large ones no longer crash.
  • Pluggable AnimationTime clock. Motion, Timeline, MorphAnimation, Image.animate, and Label tickers now all route through a new AnimationTime class that defaults to System.currentTimeMillis() but can be overridden. Tests can drive animations deterministically frame by frame; demos can run in slow motion or fast forward; Motion.slowMotion is no longer the only lever.
  • POSIX character classes for non-ASCII letters. [[:alpha:]], [[:alnum:]], [[:lower:]], and [[:upper:]] silently failed to match anything outside the basic ASCII range: Greek, Cyrillic, CJK ideographs, accented letters, vulgar fractions, currency symbols. They now match the way you would expect, with five regression tests covering the failing cases from the issue.
  • Fail-fast on JDK < 11. The simulator and "Run as desktop app" goals fork the JVM with --add-exports=java.desktop/com.apple.eawt=ALL-UNNAMED, which JDK 8 rejects with the unhelpful "Could not create the Java Virtual Machine". Now the Maven plugin checks the runtime JDK version on entry to cn1:run and cn1:debug and aborts with a friendly message naming the detected version, JAVA_HOME, and a pointer to Adoptium. JDK 11 through 25 is the supported runtime range for the simulator, JDK 8 stays the build-time requirement for the core framework, and JDK 8 is still fully supported at runtime for shipped desktop apps. Only the simulator / "Run as desktop app" Maven goals require JDK 11+.
  • Sheet scrolling, swipe, and animation. Sheet finally drags from the bottom with a real animation instead of snapping in. Issue #4825.
  • Picker positioning. Picker got additional button-positioning options and a small batch of coverage tests.
  • Playground polish. The Playground moved every Dialog.show(...) to InteractionDialog mode so user code calling Dialog.show does not blow away the editor chrome; it renders into the layered pane instead. Error messages got a substantial overhaul. The preview-resolution syntax expanded so the Playground can pick previews from a much wider set of expressions, with a new harness keeping it honest in CI.
  • Deeper refreshTheme(). Form.refreshTheme() has been around forever; it re-resolves the styles on a single Form. The new thing this week is UIManager.getInstance().refreshTheme(), which snapshots the current theme props and theme constants, clears the resolved-style caches, and re-applies the lot. This is what lets the screenshot suite flip dark mode mid-suite and see fresh styles, and what lets a runtime palette override take effect immediately. Most apps will never need to call it directly: palettes typically don't change at runtime, and a Display.setDarkMode(...) call already triggers the right invalidation. It is there if you do change the palette and want the change to stick on the next paint without reloading the theme from disk.
Where This Is Going — and a Thank-You Last week's post was about Codename One feeling faster: corrected pixel densities, principled scroll physics, SIMD on iOS, and accessibility text scaling. This week is the symbiotic other half — Codename One, looking like it belongs on a 2026 phone. Both halves are the same project. There is not much point in shipping a SIMD-accelerated Base64 if the surrounding UI looks like a 2014 app, and there is not much point in shipping a glass-frosted Dialog if the scroll underneath it judders. Neither half is finished. They are both ongoing, and they both depend on community help — bug reports, RFEs, the patient back-and-forth on issue threads where somebody describes a layout problem on an iPhone you do not own. A specific thank you to the people who drove the issues that turned into this week's commits: Thomas (@ThomasH99) filed #4781 (the original "build a liquid glass example" RFE that started this whole effort), #4807 (sticky headers), #4838 (sideways tab swipe), #4841 (the POSIX regex fix), #4819 (picker buttons), and several others; Francesco Galgani (@jsfan3) filed #4825 (sheet swipe animation) and #4824 (light + dark theme by default in initializr); @ddyer0 caught #4811 (the EDT stack overflow) and #4767 (iPad restart Form size); Lucca Biagi (@LuccaPrado) filed #4817 (form creation in IntelliJ). Several of those are RFEs you would not file unless you actually use the framework day-to-day, and that is the kind of feedback that turns into shippable work. We are sitting at 496 open issues as of this post. That is slow but steady progress — the number is moving in the right direction week over week, and the issues that close tend to ship as features or fixes you can see, not as silent triage. If you have a problem, file it. If you have an RFE, file that too. The themes you saw above started as an RFE. 
You can try the new themes today by opening the Playground, by setting nativeTheme=modern (or ios.themeMode=modern / and.themeMode=modern for finer control) in your project's codenameone_settings.properties, or by picking them from the simulator's new Native Theme menu. New projects from the initializr already have them on. The shipping resources are bundled in the iOS and Android ports as of this week.
Building Threat Intelligence Pipelines Using Python, APIs, and Elasticsearch


By Krishnaveni Musku
Threat intelligence becomes operationally valuable when indicator data can be collected continuously, normalized into a consistent schema, and queried fast enough to support enrichment and detection workflows. Standardized exchange formats such as STIX and transport protocols such as TAXII exist specifically to make machine-readable cyber threat intelligence easier to distribute at scale, while preserving enough structure for downstream correlation and context. Operational Requirements That Shape Intelligence Pipelines A threat intelligence pipeline is best treated as data engineering with security-specific constraints: provenance must remain intact, updates and revocations must be representable, and “freshness” should be measurable rather than assumed. STIX is explicitly designed to model cyber threat intelligence using typed objects with attributes, and it supports building richer context by linking objects through relationships rather than shipping flat indicator lists. A practical pipeline design often separates raw ingestion from normalized storage. Raw ingestion preserves the original feed payload for auditability and reversibility, while normalized storage produces documents that are easy to match against telemetry. This split aligns with STIX’s modeling approach, where producers may publish Indicators expressed as STIX patterns and connect them to other objects through relationship constructs, enabling consumers to choose between lightweight atom extraction for matching and graph-style context for analysis. Pulling From TAXII and Other APIs Without Losing Provenance TAXII 2.1, published by OASIS Open, defines a RESTful API and related requirements for TAXII clients and servers to exchange cyber threat information in a scalable manner, with STIX 2.1 support described as mandatory to implement in the TAXII context. 
The IANA media type registration for application/taxii+json also documents that the older application/vnd.oasis.taxii+json name is a deprecated alias, which matters in real integrations because content negotiation and strict header validation vary by server implementation.

TAXII 2.1 also formalized mechanics that directly affect pipeline correctness under load. The CTI documentation notes that TAXII 2.1 added limit and next URL parameters and updated content negotiation and media types, reflecting a move toward pagination patterns that can handle large or rapidly changing datasets more safely than item-based offset pagination. A Python pipeline can either implement paging logic directly or delegate it to a client library; the taxii2client project documents a TAXII 2.1 client API that uses application/taxii+json;version=2.1 for Accept handling and provides an as_pages helper for TAXII 2.1 endpoints that support pagination, including “Get Objects” and “Get Manifest.”

Python
    from taxii2client.v21 import as_pages

    def iter_taxii_objects(collection, cursor, page_size=2000):
        """Yield STIX objects added to a TAXII 2.1 collection after `cursor`."""
        accept = "application/taxii+json;version=2.1"
        pages = as_pages(collection.get_objects, per_request=page_size,
                         added_after=cursor, accept=accept)
        for page in pages:
            # Accept either a parsed envelope (dict) or a response object with .json()
            envelope = page if isinstance(page, dict) else page.json()
            for obj in envelope.get("objects", []):
                yield obj

This pattern avoids embedding server-specific pagination tokens into pipeline logic while still enabling incremental collection reads. The cursor argument can be persisted as an ISO-8601 timestamp when the upstream provides a timestamp filter, a model commonly used by TAXII-feed vendors; for example, ESET documents STIX 2.1 feeds delivered via TAXII 2.1 collections and describes an added_after filter parameter for retrieving objects added after a specified timestamp, alongside retention constraints that make incremental pulls operationally necessary.

Not all threat intelligence sources are TAXII-first.
MISP Project documentation describes a REST-accessible STIX export capability and explicitly notes that STIX XML export can be slow and lead to timeouts with large events or collections, while STIX JSON avoids that regime, making JSON a more stable transport choice for high-volume automation. The same ecosystem provides a published OpenAPI specification and a dedicated converter library, misp-stix, which supports bidirectional conversion across STIX versions, including STIX 2.1, and includes features such as pattern parsing and indicator-observable fingerprinting, reducing the cost of maintaining bespoke mapping logic for every upstream source. Normalization Into ECS and STIX-Aware Semantics Normalization is where a pipeline either becomes queryable or becomes another archive. The Elastic Common Schema (ECS) threat field guidance explicitly frames threat.* as the mapping layer that normalizes threat intelligence indicators from many structures into consistent fields, and it links that normalization to detection and enrichment workflows such as indicator match rules. In particular, the guidance calls out normalizing indicators into threat.indicator.* so that disparate feeds can be queried consistently and used to build indicator matching logic without treating every provider as a special case. Atomic indicators benefit from being stored both as “typed value” and as vendor identifiers. ECS defines threat.indicator.type values aligned with cyber observable types and documents threat.indicator.id as a place to store indicator IDs, noting that a STIX 2.x indicator ID is a common approach and that the field can hold multiple values to represent the same indicator across systems. The practical implication is that a pipeline can preserve the upstream STIX identifier, attach a stable provider-local identifier when necessary, and still normalize the matchable indicator value into fields such as threat.indicator.ip or other threat.indicator.* subfields. 
Python
    def stix_confidence_to_nlmh(value):
        """Map a STIX 0-100 confidence score onto the None/Low/Medium/High scale."""
        if value is None:
            return "Not Specified"
        v = int(value)
        if v == 0:
            return "None"
        if 1 <= v <= 29:
            return "Low"
        if 30 <= v <= 69:
            return "Medium"
        if 70 <= v <= 100:
            return "High"
        return "Not Specified"

    # Only single-value ("atomic") comparisons are handled; anything else is skipped.
    ATOMIC_TYPES = ("ipv4-addr", "ipv6-addr", "domain-name", "url")

    def extract_atomic_from_pattern(pattern):
        """Return (type, value) for a simple pattern such as [ipv4-addr:value = '198.51.100.7']."""
        p = (pattern or "").strip()
        if "'" not in p:
            return (None, None)
        for stix_type in ATOMIC_TYPES:
            if stix_type + ":value" in p:
                # The value is the first single-quoted string in the pattern
                return (stix_type, p.split("'")[1])
        return (None, None)

    def stix_indicator_to_ecs(indicator_obj, provider, fetched_at_iso):
        itype, ivalue = extract_atomic_from_pattern(indicator_obj.get("pattern"))
        if not itype:
            return None
        doc = {
            "@timestamp": fetched_at_iso,
            "event": {"kind": "enrichment", "category": ["threat"], "type": ["indicator"]},
            "threat": {
                "indicator": {
                    "type": itype,
                    "provider": provider,
                    "name": indicator_obj.get("name") or ivalue,
                    "description": indicator_obj.get("description"),
                    "confidence": stix_confidence_to_nlmh(indicator_obj.get("confidence")),
                    "reference": indicator_obj.get("id"),
                    "id": [indicator_obj.get("id")],
                }
            },
            "labels": {"feed": provider},
        }
        if itype in {"ipv4-addr", "ipv6-addr"}:
            doc["threat"]["indicator"]["ip"] = ivalue
        return doc

The extraction logic deliberately scopes itself to common “atomic” patterns to keep parsing deterministic and to minimize the risk of silently incorrect field derivation. This constraint matches the operational intent of ECS indicator guidance, which emphasizes consistent querying and reuse for indicator match rules after normalization, rather than attempting to fully interpret every possible composite STIX pattern in real time.

Indexing Strategy in Elasticsearch That Avoids Accidental Cost Explosion

Elasticsearch storage decisions are not purely operational preferences because they alter what update patterns are safe.
Data streams consist of one or more hidden backing indices and require a matching index template; every document indexed into a data stream must include an @timestamp field mapped as a date-type (or date_nanos). Data streams are described as a good fit for most time-series use cases, while the documentation explicitly flags that frequent reuse of the same _id expecting last-write-wins can indicate a better fit for an index alias with a write index rather than a data stream. Threat intelligence pipelines often straddle that boundary: indicator state changes and revocations benefit from upsert semantics, while ingestion audits benefit from append-only history. Retention should be tied to query strategy. Elastic Security documentation warns that indicator match rules can consume significant resources and recommends limiting the indicator index query time range to the minimum necessary for coverage, with a default example query of the past 30 days. Even outside an alerting engine, a time-bounded indicator set tends to be operationally safer: it reduces scan cost, makes cache behavior more predictable, and avoids matching against long-expired infrastructure that is no longer relevant. When vendor retention is narrower, such as the 14-day retention window described for some TAXII feeds, the pipeline should persist that constraint as a policy and avoid relying on “full historical replay” as a recovery mechanism. Ingestion-Time Guardrails With Python, Ingest Pipelines, and Bulk Writes Ingest pipelines provide an explicit place to enforce normalization rules at ingest time. Elastic documentation describes ingest pipelines as a sequence of processors that run sequentially to transform data before it is indexed into a data stream or index, supporting operations such as removal, extraction, and enrichment. 
In addition, ingest processors can access ingest metadata under the _ingest key, and Elasticsearch notes that pipelines create _ingest.timestamp by default and that indexing ingest metadata requires explicitly setting it via a processor.

JSON
    PUT /_ingest/pipeline/ti_normalize
    {
      "description": "Normalize threat intel indicators into ECS threat.indicator.*",
      "processors": [
        { "set": { "field": "event.kind", "value": "enrichment" } },
        { "set": { "field": "event.category", "value": ["threat"] } },
        { "set": { "field": "event.type", "value": ["indicator"] } },
        { "set": { "field": "event.ingested", "value": "{{{_ingest.timestamp}}}" } },
        {
          "fingerprint": {
            "fields": ["threat.indicator.provider", "threat.indicator.type", "threat.indicator.ip"],
            "target_field": "threat.indicator.fingerprint",
            "method": "SHA-256",
            "ignore_missing": true
          }
        }
      ]
    }

Bulk ingestion should align with Elasticsearch’s wire format rules. The bulk API documentation describes NDJSON requirements, including that the final line must end with a newline character and that JSON actions and sources should not be pretty printed because newlines are literal delimiters. A Python producer can serialize documents into bulk batches, assign a deterministic _id derived from provider and atomic indicator value to make writes idempotent, and optionally route documents through the normalization pipeline configured above.
Python
    import json

    def build_indicator_id(provider, itype, ivalue):
        # Deterministic _id: re-ingesting the same indicator overwrites instead of duplicating
        return (provider + ":" + itype + ":" + ivalue).lower()

    def bulk_index_indicators(es_http, index_name, docs):
        # es_http stands for any HTTP client wrapper exposing post(path, body=..., headers=...)
        lines = []
        for d in docs:
            ti = d.get("threat", {}).get("indicator", {})
            doc_id = build_indicator_id(
                ti.get("provider", "unknown"),
                ti.get("type", "unknown"),
                ti.get("ip", ti.get("name", "unknown")),
            )
            # json.dumps emits single-line JSON, as NDJSON requires
            lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id, "pipeline": "ti_normalize"}}))
            lines.append(json.dumps(d))
        # The bulk body must end with a newline
        payload = "\n".join(lines) + "\n"
        return es_http.post("/_bulk", body=payload, headers={"Content-Type": "application/x-ndjson"})

The NDJSON newline termination is not optional, so building the payload in a way that always emits a trailing newline avoids a class of partial-ingest failures that are hard to diagnose under load. For enrichment use cases, ingest-time join behavior should be applied cautiously: Elastic warns that the enrich processor can impact ingest speed, recommends benchmarking, and explicitly states that it is not recommended for appending real-time data, instead working best with reference data that does not change frequently. This guidance aligns with threat intelligence practice: fast-changing indicators typically work better as a queried dataset, joined at search or detection time, rather than as an ingest-time enrichment applied to every event.

Conclusion

A threat intelligence pipeline built on Python, APIs, and Elasticsearch becomes reliable when it treats schemas, media types, and update semantics as core engineering concerns instead of integration details. STIX and TAXII provide standard object modeling and transport expectations, including content negotiation and pagination mechanics, while ECS provides a target schema that makes indicators consistently queryable and directly usable by matching workflows such as indicator match rules.
High-quality implementations preserve provenance, normalize into threat.indicator.* with STIX-aligned confidence semantics, choose an indexing strategy that matches expected update patterns, and enforce ingestion guardrails through ingest pipelines, simulation, and NDJSON-correct bulk writes.
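The conclusion mentions simulation and correct bulk writes, but the examples above stop short of both. As a minimal sketch under stated assumptions: Elasticsearch exposes a simulate endpoint for ingest pipelines (POST /_ingest/pipeline/<pipeline-id>/_simulate) that accepts a docs array, and the bulk API reports per-item outcomes in an items array next to a top-level errors flag. The helper names build_simulate_body and failed_bulk_items are hypothetical, and sending the requests is left to whatever HTTP client the pipeline already uses.

```python
import json


def build_simulate_body(sample_docs):
    """Build the JSON body for POST /_ingest/pipeline/<pipeline-id>/_simulate."""
    # Each sample document is wrapped in the _source envelope the endpoint expects.
    return json.dumps({"docs": [{"_source": d} for d in sample_docs]})


def failed_bulk_items(bulk_response):
    """Return (doc_id, status, reason) for every item the bulk API rejected."""
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is keyed by its action name (index, create, update, delete).
        for result in item.values():
            error = result.get("error")
            if error is None:
                continue
            reason = error.get("reason") if isinstance(error, dict) else str(error)
            failures.append((result.get("_id"), result.get("status"), reason))
    return failures
```

A bulk request can return HTTP 200 while individual items fail, so a producer that only checks the status code can silently drop indicators; passing the parsed response through a check like failed_bulk_items after every batch makes those rejections visible.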
Optimizing Databricks Spark Pipelines Using Declarative Patterns
By Seshendranath Balla Venkata
Zero-Downtime Deployments for Java Apps on Kubernetes
By Ramya vani Rayala
Stateless JWT Auth Microservice Architecture With Spring Boot 3 and Redis Sentinel
By Erkin Karanlık
Kafka and Spark Structured Streaming in Enterprise: The Patterns That Hold Up Under Pressure

I've been running Kafka and Spark Structured Streaming together in production for about five years. Not in demo environments or proof-of-concept projects. In systems processing insurance claims, manufacturing telemetry, and financial transaction data, with SLAs and compliance requirements, and people who call you at 2 AM when things break.

There's a version of Kafka plus Spark Structured Streaming that looks elegant in architecture diagrams and falls apart in the first month of production. There's another version that's uglier in places but genuinely reliable. Here is what I've learned about the difference.

Getting Checkpointing Right From the Start

In my experience, checkpointing is non-negotiable for any streaming job that needs recovery. But checkpointing to local disk, which is the easiest configuration, means your streaming job can't recover from a node failure, only from a process restart. Checkpoint location must be on durable shared storage, ADLS Gen2 or equivalent, from the first day in production.

The checkpoint contains the Kafka offsets that have been committed and the state store for stateful operations. Changing either of these, whether by manually deleting the checkpoint or by changing the query name, will reset your consumer offsets. I've seen this happen accidentally twice: once when an engineer thought deleting a stale checkpoint directory was a cleanup operation, and once when a code refactoring changed the query name used as the checkpoint key. Both required manual offset reconstruction from Kafka's own offset storage. Neither was catastrophic, but both were stressful and avoidable.

Micro-Batch Sizing for Your Use Case

Spark Structured Streaming processes data in micro-batches. The trigger interval controls how often a micro-batch runs. The default, if you don't specify a trigger, is to run a new batch immediately after the previous one completes. This is correct for high-throughput workloads where you want to process data as fast as possible.
It's wrong for moderate-throughput workloads where you want predictable latency and manageable file sizes in your output Delta Lake tables. For our manufacturing telemetry pipeline (moderate throughput, near-real-time requirement), we use a 30-second trigger. This produces files of roughly 50 to 100MB in the output Delta table, which is manageable with a nightly compaction job. For our insurance claims pipeline (lower throughput, 5-minute SLA), we use a 2-minute trigger.

My rule of thumb: choose a trigger interval that produces output files in the 50 to 500MB range for your throughput. Files significantly smaller than this create compaction debt. Files significantly larger than this create memory pressure during the micro-batch.

```python
# Trigger interval examples for different workloads
# (each running query needs its own checkpoint location and output path)

# High throughput: leave the trigger unset so each batch starts as soon as
# the previous one finishes
high_throughput_writer = df.writeStream

# Backlog catch-up (Spark 3.3+): process everything available, then stop
catch_up_writer = df.writeStream \
    .trigger(availableNow=True)

# Moderate throughput (manufacturing telemetry): 30-second batches
telemetry_query = df.writeStream \
    .trigger(processingTime="30 seconds") \
    .outputMode("append") \
    .format("delta") \
    .option("checkpointLocation", checkpoint_path) \
    .start(output_path)

# Low throughput (insurance claims): 2-minute batches
claims_query = df.writeStream \
    .trigger(processingTime="2 minutes") \
    .outputMode("append") \
    .format("delta") \
    .option("checkpointLocation", checkpoint_path) \
    .start(output_path)
```

Kafka Partition Count and Spark Parallelism

Each Kafka partition is consumed by one Spark task per micro-batch. If your topic has 8 partitions, Spark uses 8 tasks for the Kafka read stage. If your downstream processing is more CPU-intensive than the Kafka read, you'll want more parallelism downstream. Use repartition() after the Kafka source read to increase parallelism for the heavy processing stages.
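The relationship between partition count and cluster size comes down to simple arithmetic. The sketch below is a simplification: it assumes the default one-task-per-partition mapping, one core per task, and no other stages competing for the same cores. The function names are illustrative, not Spark APIs.

```python
import math


def read_stage_waves(kafka_partitions: int, executor_cores: int) -> int:
    """Number of scheduling rounds the Kafka read stage needs.

    Assumes one task per Kafka partition and one core per task.
    """
    if kafka_partitions <= 0 or executor_cores <= 0:
        raise ValueError("partitions and cores must be positive")
    return math.ceil(kafka_partitions / executor_cores)


def idle_cores_in_last_wave(kafka_partitions: int, executor_cores: int) -> int:
    """Cores left without a read task during the final, partial wave."""
    remainder = kafka_partitions % executor_cores
    return 0 if remainder == 0 else executor_cores - remainder


# An 8-partition topic on a 32-core cluster reads in a single wave,
# but most cores sit idle unless later stages are repartitioned.
print(read_stage_waves(8, 32), idle_cores_in_last_wave(8, 32))
```

A single wave with many idle cores suggests repartitioning downstream. Many waves per micro-batch suggest the topic has more partitions than the cluster can use at once.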
In the other direction: if your Kafka topic has 200 partitions because it was sized for high throughput, but your Spark cluster has 32 cores, you're pushing 200 tasks through 32 cores in several waves and paying task scheduling overhead on every micro-batch. Consider whether the partition count on the topic is appropriate for your actual throughput.

Stateful Operations and Watermarks

Windowed aggregations and stream-stream joins require Spark to maintain state across micro-batches. Without a watermark, Spark will accumulate state indefinitely, and your executor memory will grow without bound until the job fails. Always define a watermark on your event-time column for any stateful operation.

The watermark threshold is a business decision as much as a technical one. A 10-minute watermark means Spark will discard events that arrive more than 10 minutes after the event time they are associated with. If your source systems can deliver events up to 30 minutes late (common in some IoT and batch-settlement scenarios), a 10-minute watermark will cause late events to be silently dropped. Understand your source latency characteristics before setting the watermark.

```python
# Watermark definition for late-arriving events
from pyspark.sql.functions import avg, col, count, from_json, window
from pyspark.sql.functions import sum as spark_sum  # avoid shadowing the builtin sum

claims_parsed = kafka_df \
    .select(from_json(col("value").cast("string"), claims_schema)
            .alias("data")) \
    .select("data.*") \
    .withWatermark("event_timestamp", "30 minutes")

# Windowed aggregation with watermark
claims_hourly = claims_parsed \
    .groupBy(
        window("event_timestamp", "1 hour"),
        "claim_type", "region"
    ) \
    .agg(
        count("claim_id").alias("claim_count"),
        spark_sum("claim_amount").alias("total_amount"),
        avg("claim_amount").alias("avg_amount")
    )
```

The Monitoring You Need on Day One

First up: consumer lag per partition. This is the most important streaming metric. Growing lag means your consumer can't keep up with producer throughput, and your latency SLA is in jeopardy.

Second: micro-batch duration.
If micro-batch duration exceeds your trigger interval, you have a processing bottleneck. The job is trying to run continuously without keeping up.

Third: state store size for stateful operations. A growing state store is a memory leak waiting to become an OOM failure.

My team emits these three metrics from every streaming job to Azure Monitor. When any of them crosses a threshold, we get an alert before users notice a problem. Setting this up properly at deployment time, not after the first production incident, has saved us from several avoidable outages.

```python
# Azure Monitor metrics emission from Spark streaming
import json
from datetime import datetime, timezone

from opencensus.ext.azure import metrics_exporter
from pyspark.sql.streaming import StreamingQueryListener


class StreamingMetricsListener(StreamingQueryListener):
    def __init__(self, app_insights_key):
        self.exporter = metrics_exporter.new_metrics_exporter(
            connection_string=f"InstrumentationKey={app_insights_key}")

    def onQueryStarted(self, event):
        pass  # required by the listener interface

    def onQueryTerminated(self, event):
        pass  # required by the listener interface

    def onQueryProgress(self, event):
        p = event.progress
        self.emit("consumer_lag", self._lag(p.sources[0]))
        self.emit("batch_duration_ms", p.batchDuration)
        self.emit("state_store_rows",
                  p.stateOperators[0].numRowsTotal if p.stateOperators else 0)

    @staticmethod
    def _lag(source):
        # Offsets are JSON strings shaped like {"topic": {"partition": offset}}.
        # Lag is the latest available offset minus the offset read so far;
        # endOffset - startOffset would measure batch size instead.
        end = json.loads(source.endOffset)
        latest = json.loads(source.latestOffset)
        return sum(latest[t][part] - end[t][part] for t in end for part in end[t])

    def emit(self, name, value):
        # Send to Azure Monitor / Application Insights
        # (payload shape is simplified; adapt it to what your exporter expects)
        self.exporter.export_metrics([{
            "name": f"streaming.{name}",
            "value": value,
            "timestamp": datetime.now(timezone.utc)
        }])


spark.streams.addListener(StreamingMetricsListener(AI_KEY))
```
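Consumer lag is the gap between the newest offset available in Kafka and the offset the query has read up to. Here is a minimal sketch of that calculation per partition, assuming the offsets arrive as JSON strings shaped like `{"topic": {"partition": offset}}`, which is how the Kafka source reports `endOffset` and `latestOffset` in recent Spark versions. The helper name `lag_per_partition` is hypothetical.

```python
import json


def lag_per_partition(end_offset_json: str, latest_offset_json: str) -> dict:
    """Return {(topic, partition): lag} from two Kafka offset JSON strings."""
    end = json.loads(end_offset_json)
    latest = json.loads(latest_offset_json)
    lag = {}
    for topic, partitions in latest.items():
        for partition, latest_offset in partitions.items():
            # A partition missing from endOffset is treated as read up to
            # offset 0, which is an approximation
            read_up_to = end.get(topic, {}).get(partition, 0)
            lag[(topic, int(partition))] = max(latest_offset - read_up_to, 0)
    return lag


example = lag_per_partition(
    '{"claims": {"0": 120, "1": 95}}',
    '{"claims": {"0": 150, "1": 95}}',
)
print(example)
```

Alerting on the maximum value rather than the sum keeps one hot partition from hiding behind healthy ones.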

By Kuladeep Sandra
Building a Reusable Framework to Standardize API Ingestion in an On-Prem Lakehouse

In many enterprise lakehouse environments, the biggest ingestion challenge is not data volume; it is inconsistency. As platforms grow, data starts arriving from many different systems through REST APIs, SOAP services, SFTP drops, database extracts, queues, and other interfaces. In many teams, these integrations are built one by one to solve immediate business needs. Over time, that creates a fragmented connector landscape where every source behaves a little differently.

One connector may implement retries one way. Another may use a different authentication pattern. A third may handle pagination, validation, and failures entirely differently. The result is a platform that works, but becomes harder to operate, extend, and support as the number of sources increases. That is the problem this framework was designed to solve.

The Core Problem

Custom connectors are easy to justify in the short term. Each one is tailored to its source and can be delivered quickly. But as the number of integrations grows, so does the operational burden. Failures become harder to troubleshoot because each connector has its own behavior. Onboarding new engineers takes longer because they must learn multiple implementation styles. Adding new sources becomes slower because common concerns, such as retry handling, pagination, authentication integration, validation, and dead-letter routing, are repeatedly rebuilt. At that point, the issue is no longer ingestion itself. It is the lack of a consistent ingestion model.

The Design Objective

The goal was to create a reusable ingestion framework that made new source onboarding simple, operational behavior consistent, and maintenance easier over time.
The framework needed to do four things well:

• Standardize common operational concerns
• Minimize source-specific code
• Remain easy to debug
• Support a wide range of batch and near-batch integrations without becoming overly abstract

The solution was a connector framework built around a base class, declarative configuration, and clearly defined lifecycle hooks. In this model, engineers only need to implement two source-specific methods:

• fetch() to call the source and retrieve raw data
• parse() to convert that raw payload into a DataFrame with a defined schema

Everything else is managed by the framework.

The Connector Pattern

The base connector owns the cross-cutting concerns that should not be reimplemented for every source. That includes:

• Session creation
• Authentication resolution
• Pagination orchestration
• Retry handling
• Rate-limit enforcement
• Validation
• Writing valid records
• Routing invalid records to dead-letter storage

A simplified version of the pattern looks like this:

```python
from abc import ABC, abstractmethod
from pyspark.sql import DataFrame, SparkSession
import requests
import yaml


class BaseApiConnector(ABC):
    def __init__(self, config_path: str, spark: SparkSession):
        with open(config_path) as f:
            self.config = yaml.safe_load(f)
        self.spark = spark
        self.session = self._build_session()

    def _build_session(self) -> requests.Session:
        session = requests.Session()
        session.headers.update(self.config.get("headers", {}))
        auth = self._resolve_auth()
        if auth:
            session.auth = auth
        return session

    @abstractmethod
    def fetch(self, params: dict) -> dict:
        """Call the source and return raw response data."""
        ...

    @abstractmethod
    def parse(self, raw_data: dict) -> DataFrame:
        """Convert raw response into a DataFrame."""
        ...

    def run(self):
        all_data = self._paginated_fetch()
        df = self.parse(all_data)
        validated, dead = self._validate(df)
        self._write(validated)
        self._write_dead_letters(dead)
```

The key idea is the separation of responsibility. The framework owns common ingestion behavior.
The connector implementation only owns source-specific logic. That keeps the design simple without forcing every connector to solve the same operational problems repeatedly.

Why Declarative Configuration Matters

A reusable framework only works if source behavior can be defined without constantly changing code. Each source is therefore described through configuration. A typical configuration includes:

• Source metadata
• Connection settings
• Authentication reference
• Pagination strategy
• Schema expectations
• Retry overrides
• Rate-limit settings

For example:

```yaml
source:
  name: customer-api
  type: rest_api
  schedule: "every few hours"
connection:
  base_url: https://example-api.company.com
  auth:
    type: oauth2_client_credentials
    secret_reference: customer-api-credentials
  timeout_seconds: 30
pagination:
  type: cursor
  cursor_field: "nextPageToken"
  page_size: 1000
schema:
  required_columns: [id, name, status, created_at]
  output_path: /bronze/domain/entity
  format: parquet
retry:
  max_attempts: 3
  backoff_strategy: exponential
rate_limit:
  requests_per_second: 5
```

This approach has two major advantages. First, it reduces the amount of code required to onboard a new source. Second, it makes source behavior more transparent. Engineers can understand how a connector behaves by reading its configuration rather than tracing through custom implementations.

Sensitive values should not be stored directly in configuration. Instead, configuration should reference a centralized secret management mechanism and resolve credentials securely at runtime.

Standardizing the Right Things

Not every part of ingestion should be configurable, and not every part should be customized. The framework works best when it standardizes the concerns that are common across most sources.

Pagination

Most APIs use a limited number of pagination styles, usually cursor-based, offset-based, or token-based pagination. Because those patterns are common, pagination belongs in the framework rather than in each connector.
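To show why pagination fits naturally in the framework, here is a minimal sketch of a cursor loop. It assumes the response carries the next cursor in the field named by `cursor_field`, the rows sit under a `records` key, and the cursor is sent back as a `cursor` request parameter. Those names, and the `paginated_fetch` helper itself, are illustrative assumptions rather than part of the framework described here.

```python
from typing import Callable, Optional


def paginated_fetch(
    fetch: Callable[[dict], dict],
    cursor_field: str = "nextPageToken",
    page_size: int = 1000,
    max_pages: int = 10_000,
) -> dict:
    """Call fetch() until the response stops returning a cursor."""
    records: list = []
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard stop in case the source never ends
        params = {"page_size": page_size}
        if cursor:
            params["cursor"] = cursor  # assumed request parameter name
        page = fetch(params)
        records.extend(page.get("records", []))
        cursor = page.get(cursor_field)
        if not cursor:
            return {"records": records}
    raise RuntimeError(f"pagination did not finish within {max_pages} pages")


# Stand-in for a connector's fetch(): two pages, then no cursor
_pages = {
    None: {"records": [{"id": 1}, {"id": 2}], "nextPageToken": "p2"},
    "p2": {"records": [{"id": 3}]},
}


def fake_fetch(params: dict) -> dict:
    return _pages[params.get("cursor")]


result = paginated_fetch(fake_fetch)
print(len(result["records"]))
```

Because the loop only depends on a `fetch` callable, each connector keeps its source-specific request logic while the stop condition and the page cap live in one place.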
Retry Handling

Retry behavior should also be standardized. Transient failures such as throttling and temporary service errors usually deserve automatic retries. Permanent client-side failures should typically fail fast. Centralizing this logic reduces inconsistency and improves predictability.

Rate Limiting

Request pacing is another concern that should not be reimplemented per connector. Framework-level rate limiting helps protect upstream systems and reduces the likelihood of unnecessary throttling.

Validation and Dead-Letter Routing

Data quality handling is often inconsistent in connector-heavy platforms. Standard validation and dead-letter handling make ingestion outcomes easier to monitor and troubleshoot.

Onboarding a New Source

Once the framework is in place, adding a new source becomes much simpler. A typical connector implementation may look like this:

```python
class CustomerAccountsConnector(BaseApiConnector):
    def fetch(self, params: dict) -> dict:
        endpoint = f"{self.config['connection']['base_url']}/accounts"
        response = self.session.get(endpoint, params=params)
        response.raise_for_status()
        return response.json()

    def parse(self, raw_data: dict) -> DataFrame:
        records = raw_data.get("records", [])
        return self.spark.createDataFrame(records)
```

That is often all that is needed for a standard API integration. The connector focuses only on extracting and parsing the source response. The framework handles the operational lifecycle around it.

This is where the real value starts to show. The benefit is not just fewer lines of code. It is that every new source behaves in a familiar way.

What Improves in Practice

The biggest gain from a reusable ingestion framework is predictability.
When all connectors follow the same execution model:

• Support becomes easier because failure patterns are more consistent
• Onboarding improves because engineers learn one framework instead of many connector styles
• Maintenance effort drops because shared concerns are fixed once in the framework
• Source onboarding becomes faster because teams are not rebuilding the same plumbing repeatedly

The framework also creates a cleaner boundary between ingestion and transformation. Its job is to land validated raw or near-raw data reliably. Transformations belong downstream, where they can evolve independently without complicating ingestion logic. That separation makes both layers easier to manage.

What This Framework Is Not For

One of the most important design decisions in framework development is deciding what not to support. This pattern is a strong fit for batch and near-batch ingestion, especially for API- and file-oriented integrations. It is not the right solution for every workload. For example, it is usually not the best fit for:

• Complex transformations tightly coupled with extraction
• Very high-throughput streaming workloads
• Use cases better served by dedicated streaming or CDC platforms

Those are not shortcomings. They are intentional boundaries. A framework becomes more effective when its purpose is clear.

Final Thoughts

A good ingestion framework is not just about code reuse. It is about operational consistency. If every new source requires its own retry model, its own pagination implementation, and its own failure-handling logic, the platform will become harder to support with every additional connector. Standardizing those behaviors through a reusable framework creates a more scalable operating model.

The most valuable outcome is not technical elegance. It is reducing variability. When source onboarding becomes more repeatable, support becomes more predictable, and connector behavior becomes easier to understand, the entire platform becomes easier to scale.
That is what a reusable ingestion framework should really deliver.
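The retry policy described under Retry Handling can be sketched in a few lines. This is an illustration under stated assumptions: status 429 and 5xx responses are treated as transient, other 4xx responses fail fast, and `TransientError`, `PermanentError`, `classify`, and `call_with_retry` are hypothetical names rather than framework APIs.

```python
import time
from typing import Callable


class TransientError(Exception):
    """Failure worth retrying, such as throttling or a temporary outage."""


class PermanentError(Exception):
    """Client-side failure that a retry will not fix."""


def classify(status_code: int) -> None:
    """Raise the matching error type for a non-success HTTP status."""
    if status_code == 429 or 500 <= status_code < 600:
        raise TransientError(status_code)
    if 400 <= status_code < 500:
        raise PermanentError(status_code)


def call_with_retry(
    operation: Callable[[], object],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Run operation(), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted
            sleep(base_delay * 2 ** (attempt - 1))  # doubles after each failure
        # PermanentError is not caught, so it propagates immediately


# Usage: two throttled responses, then success; delays are recorded, not slept
delays = []
statuses = iter([503, 429, 200])


def flaky():
    classify(next(statuses))
    return "ok"


print(call_with_retry(flaky, sleep=delays.append), delays)
```

Passing `sleep` as a parameter keeps the policy testable without real waiting, and the defaults mirror the `max_attempts: 3` and exponential backoff settings from the example configuration.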

By Kuladeep Sandra
Zone-Free Angular: Unlocking High-Performance Change Detection With Signals and Modern Reactivity

Angular’s move toward zoneless change detection is a change in scheduling semantics rather than a removal of change detection. Instead of using Zone.js to infer that a render pass might be needed whenever certain asynchronous work completes, Angular schedules change detection from explicit framework notifications and from reactive state updates that Angular can track. The Angular performance guide states that zoneless is the default in Angular v21+, and it documents provideZonelessChangeDetection() as the bootstrapping hook used to enable zoneless scheduling in Angular v20.

Why Zoneless Became the Default

Angular’s official guidance frames Zone.js as a source of unnecessary synchronization. Zone.js uses DOM events and async tasks as indicators that the application state might have updated and triggers application synchronization to run change detection, while lacking insight into whether the state actually changed, so synchronization is triggered more frequently than necessary. The same guidance connects Zone.js to payload and startup overhead, debugging friction, and ecosystem compatibility risks that arise from patching native APIs, including the explicit note that some APIs cannot be patched effectively, such as async/await, which must be downleveled to work with Zone.js.

Angular’s v21 release announcement describes the maturity path behind the default, positioning zoneless change detection as progressing from experimental availability in v18 through stabilization in v20.2 and then becoming the default in v21, with zone.js and its features no longer included by default in Angular applications. The same announcement lists expected outcomes such as better Core Web Vitals, ecosystem compatibility, reduced bundle size, easier debugging, and better control over when change detection runs.

The Zoneless Notification Contract

Zoneless mode replaces patch-driven inference with an explicit notification surface.
The provideZonelessChangeDetection() API documents configuring Angular not to use Zone.js state changes to schedule change detection and states that this works whether Zone.js is absent or present because another library depends on it. The same API documentation enumerates which notifications schedule change detection in a zoneless runtime, including ChangeDetectorRef.markForCheck(), ComponentRef.setInput(), updating a signal read in a template, triggers from bound host or template listener callbacks, attaching a dirty view, removing a view, and registering a render hook.

The zoneless performance guide reinforces the same contract and connects it to code patterns used in real applications. Angular relies on notifications from core APIs to determine when to run change detection and on which views, and it calls out that AsyncPipe is an important compatibility mechanism because it calls markForCheck() automatically. The same guide recommends OnPush as a step toward zoneless compatibility and documents removing Zone.js from builds by adjusting polyfills configuration for both build and test targets and uninstalling the dependency.

```typescript
import { provideZonelessChangeDetection } from '@angular/core';
import { bootstrapApplication } from '@angular/platform-browser';

// AppComponent is the application's own root component
bootstrapApplication(AppComponent, {
  providers: [provideZonelessChangeDetection()],
});
```

Angular also documents an explicit opt-in back to zone-based scheduling when required. The provideZoneChangeDetection() API is described as enabling NgZone/Zone.js-based change detection and as supporting configuration such as eventCoalescing, which can matter when dependencies still assume the older scheduler or when existing runtime behavior must remain stable while migration proceeds incrementally.

Signals as Modern Reactivity for Targeted Updates

Signals make the notification surface usable for everyday UI state. Angular documents writable signals as getter functions and documents that template rendering is a reactive context in which Angular monitors signal reads to establish dependencies.
The signals guide also documents computed signals as lazily evaluated and memoized read-only derivations, with dynamic dependency tracking based on which signals are actually read during evaluation. In a zoneless runtime, this model aligns directly with the scheduling contract because updating a signal read in a template is itself a documented change detection trigger.

A minimal component sketch illustrates how event notifications and signal updates align with zoneless scheduling. A click handler is a bound template listener callback and, therefore, a documented scheduling trigger, and it updates a writable signal consumed by the template, which is another documented trigger. Pairing this with OnPush aligns with Angular’s recommendation for zoneless compatibility and reduces reliance on incidental global checks.

```typescript
import { ChangeDetectionStrategy, Component, computed, signal } from '@angular/core';

@Component({
  changeDetection: ChangeDetectionStrategy.OnPush,
  template: `
    <button (click)="increment()">+</button>
    <span>{{ count() }}</span>
    <span>{{ doubled() }}</span>
  `,
})
export class CounterComponent {
  readonly count = signal(0);
  readonly doubled = computed(() => this.count() * 2);

  increment() {
    this.count.update((v) => v + 1);
  }
}
```

Signals also make certain correctness constraints more visible because fewer incidental change detection passes exist to hide missing notification paths. The signals guide explicitly warns that readonly signals do not prevent deep mutation of their value and documents that the reactive context is only active for synchronous code, meaning signal reads after an asynchronous boundary are not tracked as dependencies. It also documents untracked() as a tool for preventing incidental dependency edges inside computed() and effect(), which becomes increasingly important as signal graphs grow in size and complexity.

Interop, SSR Stability, Forms, and Test Behavior

Angular’s RxJS interop completes the signals-in-templates approach for Observable-based services.
The toSignal() API is documented as subscribing to an Observable and returning a signal that provides synchronous access to the most recent emitted value, throwing if the Observable errors. The RxJS interop guide adds operational constraints that frequently matter during zoneless migration: toSignal() subscribes immediately (similar to the async pipe), automatically unsubscribes when the creating component or service is destroyed, and should not be called repeatedly for the same Observable.

```typescript
import { ChangeDetectionStrategy, Component, inject } from '@angular/core';
import { toSignal } from '@angular/core/rxjs-interop';

// UserService is an application service that exposes a user$ Observable
@Component({
  changeDetection: ChangeDetectionStrategy.OnPush,
  template: `{{ user()?.displayName ?? 'Loading…' }}`,
})
export class UserBadgeComponent {
  readonly user = toSignal(inject(UserService).user$, { initialValue: null });
}
```

Zoneless scheduling also changes how application stability and model-driven subsystems must communicate with rendering. Angular’s guide states that SSR has relied on Zone.js to determine when an application is stable enough to serialize and documents using the PendingTasks service to make Angular aware of asynchronous work that should delay serialization in a zoneless runtime, including the pendingUntilEvent helper for Observables.

The same guide calls out reactive forms: model updates such as setValue, patchValue, and similar APIs emit forms observables but do not automatically schedule component change detection, so the recommendation is to connect forms observables to a change detection notification (for example, markForCheck()) or reflect the relevant state through signals consumed by templates.

The guide also documents that TestBed uses Zone-based change detection by default when zone.js is loaded via polyfills, describes forcing zoneless behavior in tests by adding provideZonelessChangeDetection(), recommends minimizing fixture.detectChanges() when the goal is to validate real notification paths, and points to debug support via provideCheckNoChangesConfig({ exhaustive: true, interval: <milliseconds> }).
Conclusion

Zone-free Angular replaces patch-driven inference with an explicit notification surface and a reactive state model that Angular can track at the template boundary. Primary sources describe how Zone.js-driven inference triggers synchronization more often than necessary because async activity does not reliably correlate with state changes, and they also describe patching overhead and a maintenance posture that limits further patch expansion as Angular shifts away from Zone.js. Zoneless scheduling makes rendering causes explicit and predictable, and signals plus RxJS interop utilities such as toSignal() provide the production-facing primitives needed to keep UI updates fast, targeted, and sustainable as application scale and async complexity increase.

By Bhanu Sekhar Guttikonda DZone Core CORE
Ujorm3: A New Lightweight ORM for JavaBeans and Records

"Do the simplest thing that could possibly work." — Kent Beck, creator of Extreme Programming and pioneer of Test-Driven Development. I believe the Java language architects didn't exactly hit the mark when designing the API for the original JDBC library for database operations. As a result, a significant number of various libraries and frameworks have emerged in the Java ecosystem, differing in their approach, level of complexity, and quality. I would like to introduce you to a brand-new lightweight ORM library, Ujorm3, which I believe beats the competition thanks to its simplicity, transparent behavior, and low overhead. The goal of this project is to offer a reliable, safe, efficient, and easy-to-understand tool for working with relational databases without hidden magic and complex abstractions that often complicate both debugging and performance. The final release is now available in the Maven Central Repository, released under the free Apache License 2.0. The library builds on the familiar principles of JDBC but adds a thin layer of a user-friendly API on top. It works with clean, stateless objects and native SQL, so the developer has full control over what is actually executed in the database. Ujorm3 deliberately avoids implementing SQL dialects and instead uses native SQL complemented by type-safe tools for mapping database results to Java objects. It does not cache the results of any user queries. To achieve maximum speed, however, Ujorm3 retains certain metadata. Application API Classes The library offers a type-safe SelectQuery builder for constructing SQL SELECT queries smoothly in Java, while still fully supporting the classic SqlQuery for writing raw native SQL. Both approaches utilize the generated Meta classes for mapping and aliases, preventing SQL typos and ensuring compile-time safety. The SelectQuery automatically generates JOIN clauses based on the entity metadata. 
The type of join is determined by the @JoinColumn annotation:

• INNER JOIN: Used when the attribute is marked as mandatory (e.g., @JoinColumn(nullable = false)).
• LEFT JOIN: Used by default or when the attribute is explicitly marked as nullable (e.g., @Nullable or @JoinColumn(nullable = true)).

Data filtering can be defined using the where() method, which accepts a Criterion object. This object can represent a complex logical structure in the form of a binary tree, providing a clear and type-safe way to build nested conditions. Automatically generated Meta* classes enable safe column mapping without the use of typo-prone text strings. The use of a SELECT statement can then look like this, for example:

```java
final EntityContext CTX = EntityContext.ofDefault();
final EntityManager<Employee, Long> EMPLOYEE_EM = CTX.entityManager(Employee.class);

List<Employee> select() {
    return SelectQuery.run(connection(), EMPLOYEE_EM, query -> query
            .sql("SELECT") // Optional: "SELECT" is the default
            .columnsOfDomain(true)
            .column(MetaEmployee.city, MetaCity.name) // INNER JOIN (nullable = false)
            .column(MetaEmployee.city, MetaCity.countryCode)
            .column(MetaEmployee.boss, MetaEmployee.name) // LEFT JOIN (nullable = true)
            .where(MetaEmployee.id.whereGe(1L))
            .tail("ORDER BY", MetaEmployee.id)
            .toList()
    );
}
```

If you need full control over building the SQL SELECT statement, use the SqlQuery class. This class provides an API with methods for type-safe insertion of database columns or just their labels. The individual approaches differ only in the way the SQL query is constructed. Regardless of the chosen approach, the database columns are ultimately mapped to entities using column aliases in the format: "city.name". The resulting ResultSet is also mapped to entities using this same mechanism via the ResultSetMapper class.

The EntityManager is used for working with entities, providing simple CRUD operations, including batch commands, through a Crud object.
An interesting feature is the possibility of partial updates: the developer can specify an enumeration of columns to be updated, or pass the original object to the library, from which it will infer the changes itself.

The mentioned classes are illustrated in a simplified class diagram. All listed methods are public.

[Figure: simplified class diagram of the Ujorm3 API classes]

Performance

Ujorm3 achieves great results in benchmark tests, where it is compared with some popular ORM libraries. The mechanism of writing values to domain objects also contributes to the good score. Instead of the traditional approach using Java reflection, the library generates and compiles its own classes at runtime. Such an approach generally reduces memory requirements, minimizes overhead, and saves work for the Garbage Collector. The library has no dependencies on external libraries, and the compiled benchmark module (including the Ujorm3 library itself) is less than 3 MB, which is advantageous for microservices and embedded environments. However, it is good to keep in mind that in a production environment, in conjunction with slower databases, the differences in performance may partially blur.
Getting Started To try the library in your Java 17+ project, simply add the dependency to your Maven configuration: XML <dependency> <groupId>org.ujorm</groupId> <artifactId>ujo-core</artifactId> <version>3.0.0</version> </dependency> <dependency> <groupId>org.ujorm</groupId> <artifactId>ujorm-orm</artifactId> <version>3.0.0</version> </dependency> To automatically generate metamodel classes, add the optional APT configuration to the build element: XML <plugins> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-compiler-plugin</artifactId> <version>3.14.1</version> <configuration> <annotationProcessorPaths> <path> <groupId>org.ujorm</groupId> <artifactId>ujorm-meta-processor</artifactId> <version>3.0.0</version> </path> </annotationProcessorPaths> </configuration> </plugin> </plugins> The Ujorm module from the Benchmark project can be used as a template for a sample implementation. The library's codebase is currently covered by JUnit tests that utilize an in-memory H2 database (in addition to mocked objects). Before releasing the final version, I plan to add integration tests for PostgreSQL, MySQL, Oracle, and MS SQL Server databases. Useful Links Project HomepagePetStore DemoBenchmark TestsMore Examples as a JUnit Test

By Pavel Ponec
OpenAPI From Code With Spring and Java: A Recipe for Your CI

This is not "just another article about Springdoc," I promise. This is a ready-to-use recipe I was struggling to find one day, and had to build it from scratch. Have you ever needed to generate OpenAPI documentation directly from your code and, more importantly, do it in a way that fits cleanly into a CI pipeline? Swagger UI is commonly used in Spring Boot applications to visualize and test APIs from the browser. It can also expose the generated OpenAPI definition through a configurable endpoint, and that endpoint is exactly what we will use in this article. Why OpenAPI Documentation Matters Frontend Client Generation One of the most practical uses of OpenAPI documentation is automatic client generation. Tools such as OpenAPI Generator or Swagger Codegen can take an OpenAPI definition and produce TypeScript, JavaScript, or Java clients with very little manual effort. Mocking a Service Before It Is Ready In early development stages, a team may want to spin up a mock server before the real endpoints are fully implemented. Tools such as Mockoon or WireMock can use an OpenAPI specification to simulate the service. This is especially useful for frontend teams that need to move forward while backend work is still in progress. Verifying Contracts Between Services When multiple services depend on one another, compatibility becomes critical. OpenAPI documentation can be used together with tools such as Spring Cloud Contract to verify that both providers and consumers still conform to the agreed contract. The Manual Approach to Generating OpenAPI Documentation Let us start with a simple Spring Boot project. 
Add the following dependencies to pom.xml:

XML
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
    <groupId>org.springdoc</groupId>
    <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
    <version>2.6.0</version>
</dependency>

Then add Springdoc configuration to application.yml:

YAML
springdoc:
  api-docs:
    path: /api-docs
    enabled: true
  swagger-ui:
    url: /api-docs
    enabled: true

Now create a simple REST controller:

Java
import io.swagger.v3.oas.annotations.tags.Tag;
import java.util.Set;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController;

@RestController
@Tag(name = "default", description = "General API")
@RequestMapping("/api/v1/default")
public class WebRestController {

    private static final Logger log = LoggerFactory.getLogger(WebRestController.class);

    // Returns a plain-text greeting
    @GetMapping(produces = MediaType.TEXT_PLAIN_VALUE)
    @ResponseStatus(HttpStatus.OK)
    public String get() {
        log.info("GET method called");
        return "Hello!";
    }

    // Echoes the request body back as a single-element JSON array
    @PostMapping(
        consumes = MediaType.TEXT_PLAIN_VALUE,
        produces = MediaType.APPLICATION_JSON_VALUE
    )
    @ResponseStatus(HttpStatus.OK)
    public Set<String> post(@RequestBody String body) {
        log.info("POST method called");
        return Set.of(body);
    }
}

Finally, add a security configuration that allows access to both the REST API and Swagger UI:

Java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.security.config.annotation.method.configuration.EnableMethodSecurity;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configurers.CsrfConfigurer;
import org.springframework.security.web.SecurityFilterChain;

@Configuration
@EnableWebSecurity
@EnableMethodSecurity
public class WebSecurityConfig {

    // Regular profile: only the docs, Swagger UI, and the demo endpoint are public
    @Profile("!openapi")
    @Bean
    public SecurityFilterChain filterChain(HttpSecurity httpSecurity) throws Exception {
        return httpSecurity.authorizeHttpRequests(
                request -> request
                    .requestMatchers("/api-docs", "/api-docs/**").permitAll()
                    .requestMatchers("/swagger-ui/*").permitAll()
                    .requestMatchers("/api/v1/default").permitAll()
                    .requestMatchers("/**").authenticated()
            )
            .csrf(CsrfConfigurer::disable)
            .build();
    }

    // openapi profile: everything is open, used only while generating the spec
    @Profile("openapi")
    @Bean
    public SecurityFilterChain filterChainOpenApi(HttpSecurity httpSecurity) throws Exception {
        return httpSecurity.authorizeHttpRequests(
                request -> request.anyRequest().permitAll()
            )
            .csrf(CsrfConfigurer::disable)
            .build();
    }
}

Notice the separate openapi profile. We will use it later during automated generation.

At this point, you can run the application and open Swagger UI at http://localhost:8080/swagger-ui/index.html. From there, the generated OpenAPI document is available at http://localhost:8080/api-docs. You can save that response manually and use it as your specification file.

This works, but it is repetitive and not very practical for build automation. So let us move to the more useful approach: generating the spec during the Maven build.

Automatic Generation

To generate an OpenAPI file automatically, it helps to understand what actually happens during the build. The springdoc-openapi-maven-plugin does not generate the specification out of thin air. It calls the application endpoint that exposes the OpenAPI definition. In other words, your Spring Boot application must be running while the plugin executes. That is why the spring-boot-maven-plugin and springdoc-openapi-maven-plugin are typically used together.

Because the application has to be started during the build, the security configuration must also allow the documentation endpoint to be accessed in that scenario. This is exactly why the separate openapi Spring profile is useful.
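To make that mechanism concrete, here is a minimal sketch of what the generation step conceptually amounts to: an HTTP GET against the running application's API docs endpoint, with the response body written to a file. This is not the plugin's actual implementation. The class name OpenApiSnapshot and the method download are hypothetical, and the example URL and output path simply mirror the values used in this article.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

// Hypothetical helper: illustrates the idea, not the Maven plugin's real code.
public class OpenApiSnapshot {

    /** Fetches the OpenAPI document from a running application and writes it to outputFile. */
    public static Path download(URI apiDocsUrl, Path outputFile)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(apiDocsUrl)
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // Anything other than 200 usually means the app is not up or security blocked the call.
        if (response.statusCode() != 200) {
            throw new IOException(
                    "Unexpected HTTP status " + response.statusCode() + " from " + apiDocsUrl);
        }

        // Create the target directory if needed, then write the spec as-is.
        Path parent = outputFile.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.writeString(outputFile, response.body());
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: OpenApiSnapshot <apiDocsUrl> <outputFile>");
            return;
        }
        Path written = download(URI.create(args[0]), Path.of(args[1]));
        System.out.println("Wrote " + written);
    }
}
```

Running it with http://localhost:8080/api-docs.yaml and target/openapi.yml as arguments, while the application is up, would mimic the result of the build step. A 401 or 403 status here is the same failure the plugin would hit if the relaxed security profile were not active.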
Add a Dedicated Maven Profile

Add the following Maven profile to pom.xml:

XML
<profile>
    <id>openapi</id>
    <properties>
        <maven.test.skip>true</maven.test.skip>
    </properties>
    <build>
        <plugins>
            <!-- When the Maven profile is openapi, run Spring with the openapi profile -->
            <plugin>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <groupId>org.springframework.boot</groupId>
                <configuration>
                    <jvmArguments>
                        -Dspring.application.admin.enabled=true
                        -Dspring.profiles.active=openapi
                    </jvmArguments>
                </configuration>
                <executions>
                    <execution>
                        <id>pre-integration-test</id>
                        <goals>
                            <goal>start</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>post-integration-test</id>
                        <goals>
                            <goal>stop</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <!-- Generate the OpenAPI file during the build -->
            <plugin>
                <artifactId>springdoc-openapi-maven-plugin</artifactId>
                <groupId>org.springdoc</groupId>
                <version>1.4</version>
                <configuration>
                    <skip>false</skip>
                    <apiDocsUrl>http://localhost:8080/api-docs.yaml</apiDocsUrl>
                    <outputDir>${project.build.directory}</outputDir>
                    <outputFileName>openapi.yml</outputFileName>
                </configuration>
                <executions>
                    <execution>
                        <id>integration-test</id>
                        <goals>
                            <goal>generate</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</profile>

The important parts here are:

  • We create an openapi Maven profile and an openapi Spring profile, but they are not the same thing (they do not have to use these exact names, or share one name).
  • When the openapi Maven profile is active, the Spring application is started with the openapi Spring profile (see jvmArguments).
  • -Dspring.profiles.active=openapi enables the relaxed security profile created specifically for documentation generation.
  • apiDocsUrl points to the endpoint that returns the OpenAPI document.
  • outputDir and outputFileName control where the generated file is written.

These are the exact parts I struggled to find in one place, hence the "recipe" article.
Run the Generation Step

Once the profile is in place, generating the spec is easy:

Shell
./mvnw verify -Popenapi

After the build completes, the generated OpenAPI spec should be here:

Plain Text
./target/openapi.yml

Using It in a CI Pipeline

This setup is CI-friendly because the same command can run locally and in your pipeline:

Shell
./mvnw verify -Popenapi

From there, you can archive target/openapi.yml as a build artifact, publish it to an artifact repository, or pass it to frontend code generators, mock servers, and contract verification jobs.

Conclusion

Generating OpenAPI documentation manually from Swagger UI is fine for quick inspection, but it does not scale well when you need repeatability. By wiring Spring Boot and Springdoc into a dedicated Maven profile, you can generate the specification automatically during the build in your CI. That gives you a reliable OpenAPI artifact that can support client generation, service mocking, and contract verification without adding a separate manual step to the development workflow.

Bonus: Represent Set as an Array

In some cases, you may want a Set to be represented as a regular array in the generated OpenAPI specification instead of an array with uniqueItems: true. This can be useful when downstream tools expect a plain array schema (this is the exact request I once got from the frontend team).
You can customize Springdoc behavior with a small configuration class:

Java
import io.swagger.v3.oas.models.media.Schema;
import java.util.Collections;
import java.util.Set;
import org.springdoc.core.utils.SpringDocUtils;
import org.springframework.context.annotation.Configuration;

// Registered as a Spring bean so that the constructor runs at startup
@Configuration
public class SwaggerConfig {

    // Make springdoc generate an Array schema for Set.class
    // and remove uniqueItems: true
    public SwaggerConfig() {
        var schema = new Schema<Set<?>>();
        schema.type("array").example(Collections.emptyList());
        SpringDocUtils.getConfig().replaceWithSchema(Set.class, schema);
    }
}

With this adjustment in place, the generated schemas for Set will be emitted as an array, which can simplify integration with some client generators and consumers.

By Roman Dubinin
Smart Deployment Strategies for Modern Applications

Modern application development has moved toward distributed, cloud-based, and even microservices-based applications, requiring scalability, reliability, and performance under different conditions. Therefore, deployment has become a part of application development, not merely a final activity. Intelligent deployment patterns and practices are all about building applications that are not just easy to deploy, but also reliable, scalable, and efficient in production. This means moving away from traditional, manual deployment patterns and toward automated, container-based deployment practices. Docker and Kubernetes are two prominent technologies that play a vital role in this transformation and shift toward intelligent deployment patterns and practices. Docker helps developers build applications and deploy them along with their dependencies in lightweight, portable containers, overcoming environment consistency problems, while Kubernetes helps deploy, scale, and self-heal these containers. However, without an appropriate strategy, it is possible to introduce unnecessary complexity and even performance issues. Not every application needs Kubernetes, nor does every deployment issue call for a distributed solution. Knowing when to use Docker on its own, when to use Kubernetes, and when to balance performance, cost, and complexity is vital to deliver effective modern applications. This article provides smart deployment strategies using Docker and Kubernetes. It highlights the advantages, disadvantages, and performance of using Docker and Kubernetes. This gives an overview of the deployment strategy. What Docker Does Docker packages your application, all dependencies, and the run time into a small container. 
Issues Before Docker

  • "It works on my machine": behavior is inconsistent across environments such as development, test, staging, and production.
  • Dependency conflicts: mismatched language versions, missing library versions, and configuration mismatches.

Docker Benefits

  • Same behavior everywhere: local development, staging, production, and so on.
  • Isolation between apps: each app runs in its own container.
  • Fast startup: lightweight compared with a virtual machine.
  • Easy deployment: just run the container.

Plain Text
docker start <containername>

How Docker Works

Plain Text
Application Code → Dockerfile → Docker Image → Docker Container → Run application

A container image can run on a developer laptop, on virtual machines, in a data center, or in cloud environments with the same packaged runtime and dependencies.

Docker therefore solves the packaging problem. But what if the machine has 100 containers? What if one crashes? How do we scale during high traffic? How do we manage deployments? Docker itself does not solve these problems. This is where a deployment strategy is needed, and where Kubernetes can help.

What Kubernetes Does

Once the image has been created, the operational problem of running it is addressed by Kubernetes. It automates the deployment, scaling, and management of containerized applications, and it keeps the application in its desired state by replacing failed containers and rescheduling workloads as needed.

Kubernetes Benefits

  • Auto scaling: more containers (Pods) if traffic increases, and fewer if traffic decreases.
  • Self-healing: restarts a container if it crashes.
  • Load balancing: spreads the load across the containers.
  • Zero downtime deployment: updates the system without stopping it.
  • Service management: manages multiple microservices easily.

Docker builds and runs the container. Kubernetes runs the container reliably at scale.
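The autoscaling benefit listed above can be made concrete with the scaling rule documented for the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The sketch below is a toy model of that rule in Java, not Kubernetes code. The class name ReplicaCalculator, the min/max clamping parameters, and the sample numbers are illustrative assumptions, and the real controller also applies a tolerance and stabilization behavior that is left out here.

```java
// Toy model only: mirrors the documented HPA formula, not the real controller.
public class ReplicaCalculator {

    /**
     * desired = ceil(currentReplicas * currentMetric / targetMetric),
     * clamped to the range [minReplicas, maxReplicas].
     */
    public static int desiredReplicas(int currentReplicas, double currentMetric,
                                      double targetMetric, int minReplicas, int maxReplicas) {
        if (currentReplicas < 1 || targetMetric <= 0 || minReplicas > maxReplicas) {
            throw new IllegalArgumentException("invalid autoscaling input");
        }
        int raw = (int) Math.ceil(currentReplicas * (currentMetric / targetMetric));
        return Math.max(minReplicas, Math.min(maxReplicas, raw));
    }

    public static void main(String[] args) {
        // 4 Pods at 90% average CPU against a 60% target: scale out to 6.
        System.out.println(desiredReplicas(4, 90, 60, 2, 10));
        // 4 Pods at 30% average CPU against a 60% target: scale in to 2.
        System.out.println(desiredReplicas(4, 30, 60, 2, 10));
    }
}
```

The point of the formula is that scaling is proportional to how far the observed metric is from its target, which is also why an application with a single-node bottleneck gains little from adding Pods.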
For example, in a real-world scenario:

  • Docker = packing lunch boxes
  • Kubernetes = managing a large cafeteria serving thousands

Plain Text
build app → Docker container
        ↓
Deploy many containers → Kubernetes manages them

What a Kubernetes Deployment Actually Does

A Kubernetes Deployment is a resource in a cluster that manages a group of Pods and ReplicaSets for a workload, typically a stateless application. You define the desired state, and the cluster moves the actual state toward it. Kubernetes also supports rolling updates, where new Pods are created and marked as ready before the old ones are terminated.

The typical process for deploying a Spring Boot application to a Kubernetes cluster:

  1. Develop a Spring Boot application.
  2. Build and package the Spring Boot application as a Docker image.
  3. Push the Docker image to a repository.
  4. Reference the image in a Kubernetes Deployment.
  5. Kubernetes creates Pods and exposes them via a Service.

Advantages

  • Consistent deployments: Docker provides a standard unit for bundling the application and its runtime dependencies. This minimizes environment drift between development, testing, and production, and it is one of the biggest advantages of using containers for Java-based Spring Boot applications.
  • Declarative operations: Kubernetes uses a declarative model to manage deployments, which makes it easier for organizations to automate the deployment of applications.
  • Self-healing: Kubernetes can automatically replace failing containers and reschedule the application when a node becomes unavailable, so recovery does not depend on manual intervention.
  • Inbuilt scaling options: Kubernetes provides built-in autoscaling features, which make it easier to scale the application elastically and efficiently.
  • Improved service abstraction and traffic routing: A Kubernetes Service is an API object that defines a single service and provides a consistent endpoint. The system can then distribute traffic to the matching Pods. If access to the service from outside the cluster is required, Ingress or Gateway-based routing is an option.
  • Safer upgrades: New versions can be rolled out gradually using rolling updates, which reduces deployment risk.

Disadvantages

1. More Operational Complexity

While Docker on its own is simple for small applications, Kubernetes introduces additional concepts such as Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, autoscaling, and network policies. These features can be justified for production environments, but they come with a real learning and maintenance cost. The Kubernetes documentation is divided into so many sections because the platform is multi-functional by design, covering orchestration, networking, scaling, storage, and more.

2. Higher Resource Overhead

Kubernetes adds cluster components (a control plane and per-node agents) that a plain Docker setup does not need. For very small applications, this overhead may outweigh the advantages. This is a general expectation based on the Kubernetes model compared to the Docker model, not a measured result.

3. Harder Debugging

Debugging a Docker application is relatively simple because the application runs on a single host. Debugging a distributed application is far more complex because multiple hosts, Pods, and Services are involved. Again, this is a general expectation based on the Kubernetes model compared to the Docker model.

4. Misconfiguration Risk

Kubernetes is a powerful technology, but misconfiguration can lead to application failures.
Network Policies, for example, are complex features by design, requiring production-level configurations.

Performance Considerations

Kubernetes doesn’t make your application run faster on its own. Performance still relies on many factors such as application design, JVM tuning, container image quality, database performance, network latency, and resource allocation. However, Kubernetes provides operational tools, including autoscaling and rollout features, that help maintain performance under varying loads.

In general terms, performance considerations can be divided into four categories:

  • Startup performance. A Spring Boot container can be slow to start, depending on factors such as application size. Because a rollout depends on new Pods becoming available, slow startup directly slows down rollouts.
  • Runtime efficiency. Containers are generally more lightweight than traditional deployment models that use many virtual machines, which is one reason Docker is so popular. However, inefficient Docker images or oversized JVMs can still waste resources. The choice of base image matters as well, for example glibc-based versus musl-based images.
  • Scaling behavior. Horizontal pod autoscaling is useful when load increases, as it adds more Pods to handle it rather than scaling up resources for existing Pods. However, the application must be able to scale horizontally and must not have bottlenecks at the single-node level.
  • Networking overhead. Kubernetes Services add a layer of abstraction to the network. This helps manageability and load balancing, but latency-sensitive applications need careful design at every layer. The abstraction is useful for operational purposes, but it is not free.

Limitations

One limitation to be aware of is the fact that Kubernetes deployments are designed for stateless workloads.
This means that if the application has state tightly coupled with the identity of an instance, or needs ordered storage, it may not be the best candidate for a Kubernetes Deployment. The Kubernetes documentation itself describes Deployments as typically being used for workloads that “do not maintain state.”

Other practical limitations are:

  • Small teams may find Kubernetes too heavy for a simple internal app.
  • Stateful systems still require careful storage, backup, and failover planning.
  • Local development experience can become more complex than plain Docker Compose.
  • Security and networking require active design, not default trust.

When/What to Use

Scenario               Need Docker   Need Kubernetes
Run single app         Yes           No
Microservices          Yes           Yes
Production scale       Yes           Yes (Mandatory)
Auto scaling needed    No            Yes
High Availability      No            Yes

Conclusion

The modern deployment model is not just about shipping code; it’s about shipping it reliably and at scale. Docker provides consistency across environments, while Kubernetes provides scale, resilience, and automation.

The smart approach to deployment strategy is selecting the appropriate tool for the job. Docker might be enough for a simple application, but for a complex application with high availability requirements, Kubernetes becomes a must-have. By understanding the strengths and weaknesses of both tools, we can develop efficient, scalable, and sustainable deployment strategies.

By Manju George
Optimizing High-Volume REST APIs Using Redis Caching and Spring Boot (With Load Testing Code)

High-volume REST APIs can easily become bottlenecked by database access, leading to high latency and poor throughput. Even after optimizing SQL queries and adding indexes, a database call might take hundreds of milliseconds, still far slower than a competitor’s 50 ms response that leverages caching. In-memory caching offers orders of magnitude faster data access. Traditional databases measure response times in milliseconds, while Redis operations complete in microseconds. By storing frequently accessed data in memory, APIs can handle dramatically more requests per second with much lower latency. As an example, one test showed that using Redis cut an expensive request’s response time from over 10 seconds down to under 1 second. Setting Up Redis Caching in Spring Boot Before diving into patterns, let’s ensure the basic setup is in place. We assume you have a local Redis server running. In your Spring Boot project, include the necessary dependencies for caching and Redis integration. For example, add the following to your Maven pom.xml: XML <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-cache</artifactId> <version>3.1.5</version> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-data-redis</artifactId> <version>3.1.5</version> </dependency> These bring in Spring’s generic caching support and the Redis connector. Next, enable caching in your application by annotating a configuration or main class with @EnableCaching. Spring Boot will auto-configure a RedisCacheManager if it finds Redis on the classpath. You can then define cache settings via configuration. For example, you might set a default time to live for cache entries in application.properties or via a RedisCacheConfiguration bean. 
A simple property-based configuration for a local Redis could be:

Properties files
spring.cache.type=redis
# Spring Boot 3.x uses the spring.data.redis.* prefix (it was spring.redis.* in 2.x)
spring.data.redis.host=localhost
spring.data.redis.port=6379
# 600000 ms = 10 minutes TTL (comments must be on their own line in a properties file)
spring.cache.redis.time-to-live=600000

Now we have a basic cache setup. Let’s explore caching patterns and how to implement them in Spring Boot.

Write-Through and Write-Behind Caching

Caching isn’t just for reads; we also need a strategy for writes. Write-through and write-behind are patterns to handle data modifications in a cached system:

Write-Through

On every data write, the application synchronously writes to the database and the cache. This ensures the cache is always up-to-date with the latest data. In practice, a write-through approach might perform the database operation, then immediately update the Redis cache with the new value. Spring’s caching abstraction can support this via annotations like @CachePut or by combining a normal save method with a manual cache update. For example, in a product service, we might do:

Java
@CachePut(value = "products", key = "#product.id")
public Product updateProduct(Product product) {
    // Save to DB first
    Product saved = repo.save(product);
    // Spring will put this return value into the "products::[id]" cache
    return saved;
}

This method will update the database and also put the new product data into the cache under the given key. The next read for that product can be served from cache immediately, with no stale data. If we delete an item, we can use @CacheEvict to remove it from the cache at the same time as removing it from the DB, preventing ghost entries.

Write-Behind (Write-Back)

In this less common strategy, the application writes to the cache first and defers the database write until later. The idea is to batch or coalesce many writes to reduce DB pressure.

Avoiding Cache Stampede (Thundering Herd)

When caching for high-volume traffic, cache stampedes are a serious concern.
A stampede occurs when a cache entry expires or is missing, and many concurrent requests attempt to fetch the same data from the database at once. In a high QPS system, this can overwhelm the database and essentially negate the benefit of caching. We need strategies to prevent dozens or hundreds of threads from piling onto the DB when a popular item cache invalidates. One common solution is to use locking or synchronization around cache misses. The idea is to ensure only one thread does the expensive database fetch and populates the cache, while the others wait or get served a stale value. In a single-instance application, you might synchronize on a Java lock per key. In a distributed environment, you’ll want a distributed lock. Redis itself can be used to implement this. For our Spring Boot application, we could integrate Redisson and use it in the service method. For instance: Java RLock lock = redissonClient.getLock("lock:product:" + productId); boolean acquired = lock.tryLock(5, 10, TimeUnit.SECONDS); // wait up to 5s to acquire, auto-release after 10s if (acquired) { try { // Double-check cache after acquiring lock Product cached = redisTemplate.opsForValue().get(cacheKey); if (cached != null) { return cached; } // Cache still empty, fetch from DB and update cache Product dbData = repo.findById(productId); redisTemplate.opsForValue().set(cacheKey, dbData, Duration.ofMinutes(10)); return dbData; } finally { lock.unlock(); } } else { // Could not acquire lock (timed out) – fallback to a stale cache or return an error ... } In the above pseudocode, multiple threads hitting a missing cache key will attempt to tryLock. One will succeed and do the DB query, while others wait up to 5 seconds. Once the first thread populates the cache and releases the lock, the others will find the data in the cache and avoid hitting the DB. This approach effectively serializes the cache miss for a given key, preventing a herd of concurrent DB calls. 
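For the single-instance case mentioned above, a per-key guard can be sketched with the JDK alone. The example below is an illustration under stated assumptions, not code from this article's project: SingleFlightCache is a hypothetical class, its in-memory map stands in for Redis, and the loader function stands in for the repository call. Concurrent callers for the same key share one in-flight CompletableFuture, so the loader runs once per miss.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical in-process sketch; the map below stands in for the Redis cache.
public class SingleFlightCache<K, V> {

    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    private final ConcurrentMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stand-in for the expensive DB query

    public SingleFlightCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit, no coordination needed
        }

        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            // Another thread is already loading this key: wait for its result.
            return existing.join();
        }

        try {
            // Double-check the cache: a previous loader may have just finished.
            V value = cache.get(key);
            if (value == null) {
                value = loader.apply(key);
                if (value != null) {
                    cache.put(key, value);
                }
            }
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e); // waiters see the failure too
            throw e;
        } finally {
            // The cache is populated before this removal, so late arrivals hit the cache.
            inFlight.remove(key, mine);
        }
    }
}
```

A caller would wrap the repository lookup in the loader and call get(productId) from request threads. This only coordinates threads inside one JVM; across several application instances, a distributed lock such as the Redisson one above is still required.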
It’s a bit heavy, so you might not use it for every key; typically, you'll use it for very hot items or expensive queries that you know could trigger stampedes. Simpler techniques can also mitigate stampedes, like cache early recomputation or using slightly randomized TTLs so not everything expires at the same time. Load Testing the Impact of Caching With JMeter After implementing Redis caching, it’s critical to verify the performance improvements under realistic load. Apache JMeter is a popular tool for simulating concurrent users and measuring response times and throughput of your API. We can use JMeter to compare the API’s behavior with and without cache and ensure that our caching does indeed handle high volume as expected. For example, suppose we want to test an endpoint /products/{id} which we’ve optimized with caching. We can create a JMeter test plan with a Thread Group of, say, 100 threads and loop them to send requests for various product IDs. JMeter will report metrics like average response time, throughput, error rate, etc. In a baseline test, you might observe higher latencies and lower throughput. Then, in a test with the cache warmed (most requests hitting the cache), you should see a dramatic reduction in response time and the ability to handle more requests per second. In one real-world inspired demo, using Redis caching improved latency from 10 seconds on a cold miss to under 1 second on subsequent hits. Another way to look at it: memory caching can serve data so fast that your throughput might be an order of magnitude higher than relying solely on the DB. This aligns with the earlier statement that no amount of DB tuning beats data served from an in-memory cache. Using JMeter Set up JMeter (you can run it in GUI mode to design the test plan, and then use non-GUI mode for the actual high-load run for better accuracy). Configure an HTTP Request sampler pointing at your API (e.g., GET http://localhost:8080/products/1234). 
Use a Thread Group to simulate the desired number of concurrent users and iterations. You can add a Timer if you want a delay between requests, or just hammer the API as fast as possible to find its max throughput. Add listeners like Summary Report or Aggregate Report to gather results. To automate performance testing, you can even integrate JMeter with your build. A Maven plugin exists to run JMeter tests as part of a build pipeline. JMeter Configuration Snippet Suppose we want to quickly run a load test from the command line (non-GUI). We could use a command like: Shell jmeter -n -t path/to/testplan.jmx -l results.jtl -Jthreads=100 -Jduration=60 This would run the JMeter test plan for 60 seconds with 100 threads, logging results to results.jtl. Make sure to monitor your system while testing, especially if everything is on the same machine; the load test could itself become a bottleneck or interfere with results if not planned carefully. As a quick check, you can also use Spring Boot Actuator metrics or Redis monitoring to see cache hit rates. A healthy caching layer under load should show a high cache hit percentage, which correlates with lower DB usage and faster responses. Conclusion Optimizing a high-volume REST API often requires rethinking data access patterns, and Redis caching is a powerful technique to achieve massive performance gains. By using the cache-aside pattern, we serve most reads from fast in-memory storage, drastically reducing latency and database load. With write-through strategies and careful cache invalidation, we keep cached data consistent with the source of truth. It’s equally important to anticipate real-world issues like cache stampedes using locks or other techniques to prevent cache misses from overwhelming your database in a traffic surge. Finally, always test under load. Use tools like JMeter to simulate concurrent access and measure the impact of your caching. 
You should observe significant improvements in throughput and response times, validating that the cache is doing its job. If the results aren’t as expected, that’s an indication to refine your caching strategy or investigate bottlenecks.
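As a footnote to the stampede discussion, the randomized-TTL idea mentioned earlier can be sketched in a few lines. The helper below is a hypothetical example (the class TtlJitter and its fraction parameter are assumptions, not part of Spring or Redis): it spreads a base TTL by a random offset so that entries written at the same moment do not all expire together.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for spreading cache expirations over time.
public class TtlJitter {

    /**
     * Returns base plus a random offset in [-fraction * base, +fraction * base].
     * With base = 10 minutes and fraction = 0.1, the result is between 9 and 11 minutes.
     */
    public static Duration jittered(Duration base, double fraction) {
        if (base.isZero() || base.isNegative()) {
            throw new IllegalArgumentException("base TTL must be positive");
        }
        if (fraction < 0 || fraction >= 1) {
            throw new IllegalArgumentException("fraction must be in [0, 1)");
        }
        long baseMs = base.toMillis();
        long spread = (long) (baseMs * fraction);
        if (spread == 0) {
            return base; // nothing to randomize
        }
        // nextLong's upper bound is exclusive, hence spread + 1.
        long offset = ThreadLocalRandom.current().nextLong(-spread, spread + 1);
        return Duration.ofMillis(baseMs + offset);
    }

    public static void main(String[] args) {
        System.out.println(jittered(Duration.ofMinutes(10), 0.1));
    }
}
```

In the manual caching code shown earlier, the fixed Duration.ofMinutes(10) passed to redisTemplate.opsForValue().set(...) could be replaced with TtlJitter.jittered(Duration.ofMinutes(10), 0.1). A load test before and after is the way to confirm whether it smooths out expiry spikes for your traffic.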

By Mallikharjuna Manepalli
Spring CRUD Generator v1.1.0 Updates

I’ve just released Spring CRUD Generator v1.1.0 — an open-source generator that helps you bootstrap a Spring Boot CRUD backend from a single YAML specification. If you’ve built more than a couple of CRUD-heavy services, you’ve probably experienced the same pain points: repeating the same layers (entity, repository, service, controller), keeping consistent naming and structure across modules, and constantly adjusting boilerplate when requirements change. Spring CRUD Generator aims to reduce that overhead by letting you define your data model and project options once (in YAML) and generate a consistent project structure around it. This release adds field-level validation, improves Redis caching, and fixes compatibility issues so the generator works reliably with Spring Boot 3 and Spring Boot 4. It also improves behavior when Open Session In View (OSIV) is disabled by adding EntityGraph support in generated resources. What’s New in v1.1.0 1. Field Validation (fields.validation) The headline feature in v1.1.0 is a new optional “validation” section inside each entity field definition. Instead of sprinkling validation rules manually throughout your DTOs and controllers after generation, you can now describe typical constraints directly in the YAML config and have the generator produce the appropriate validation-aware output. This is useful for teams that want a single source of truth for model constraints. A YAML-driven validation model also makes it easier to review and evolve constraints alongside schema changes. For example, you can express “this field must be required” or “string length must be within a range” directly where the field is defined. A notable addition in this release is support for regex-based validation via “pattern.” That’s a practical constraint for fields like passwords, identifiers, or custom-formatted strings. 
It’s worth mentioning that the validation section is optional: if you don’t need it, you don’t have to add it, and your existing specs remain valid.

2. Cache and Redis Improvements

Caching is a common performance layer in CRUD systems, but it becomes tricky when you combine Redis serialization and Hibernate’s lazy-loaded associations. This release includes two important caching-related improvements:

  • The generator previously produced incorrect values for @Cacheable(value=...) in certain cases. That has been fixed, ensuring that cache names/values are generated consistently and correctly.
  • Cache configuration has been updated to include HibernateLazyNullModule. This improves Redis caching behavior when Hibernate lazy-loaded entities are involved. In practice, it reduces the likelihood of serialization issues (or unexpected failures) when caching objects that contain lazily-loaded properties.

If your generator output is used in services that rely on Redis for caching, this update should make caching more stable and predictable.

3. Spring Boot 3 and 4 Compatibility

Another big part of this release is compatibility. The generator is now fully compatible with both Spring Boot 3 and Spring Boot 4. In ecosystems like Spring, small changes between major versions can break builds, plugins, or code generation assumptions. Ensuring compatibility across versions is essential for teams that want to upgrade gradually (or maintain multiple services on different baselines). v1.1.0 addresses compatibility issues so you can use the generator reliably on either Spring Boot line.

4. OSIV Control + EntityGraph Support

Open Session In View (OSIV) is often considered an anti-pattern because it can hide N+1 query problems and produce unpredictable lazy-loading behavior in higher layers. Many teams disable it intentionally.

In v1.1.0, the generator introduces a new configuration entry:

additionalProperties.spring.jpa.open-in-view (default: false)

With OSIV disabled by default, the generated resources now include EntityGraph support to handle lazy relations more safely. The goal here is practical: avoid surprising LazyInitializationException scenarios without relying on OSIV. This approach also nudges generated projects toward better data-fetching discipline.

If your project still relies on OSIV, you can explicitly set it to true in the configuration. But the default aligns with a more defensive and production-friendly setup.

5. Improvements and Stability

Beyond features, v1.1.0 includes a set of improvements that make the project easier to maintain and more robust:

  • Documentation has been updated to reflect the new fields.validation structure.
  • Tests have been updated and fixed so they work correctly with the new validation model.
  • The generator internals were refactored to improve readability and maintainability.
  • A validation edge case was fixed: when min/max are not provided, the generator no longer throws a NullPointerException (NPE). This prevents configuration mistakes from turning into runtime failures during generation.

There is also an updated “full CRUD spec YAML” reference example in the repository. It existed before, but it has now been refreshed to include the new fields.validation configuration and the additionalProperties.spring.jpa.open-in-view setting. If you want to see the complete configuration surface area (not just a short snippet), that reference YAML is the best place to start.

Updated CRUD Spec YAML (Short Example)

The repository includes an updated full CRUD spec YAML reference example. Below is a shortened snippet that highlights the new and most important parts (validation + OSIV):

```yaml
configuration:
  database: postgresql
  javaVersion: 21
  springBootVersion: 4
  cache:
    enabled: true
    type: REDIS
    expiration: 5
  openApi:
    apiSpec: true
  additionalProperties:
    rest.basePath: /api/v1
    spring.jpa.open-in-view: false
entities:
  - name: ProductModel
    storageName: product_table
    fields:
      - name: id
        type: Long
        id:
          strategy: IDENTITY
      - name: name
        type: String
        column:
          nullable: false
          unique: true
          length: 10000
        validation:
          required: true
          notBlank: true
          minLength: 10
          maxLength: 10000
      - name: price
        type: Integer
        column:
          nullable: false
        validation:
          required: true
          min: 1
          max: 100
      - name: users
        type: UserEntity
        relation:
          type: OneToMany
          fetch: LAZY
          joinColumn: product_id
        validation:
          required: true
          minItems: 1
          maxItems: 10
  - name: UserEntity
    storageName: user_table
    fields:
      - name: id
        type: Long
        id:
          strategy: IDENTITY
      - name: email
        type: String
        validation:
          required: true
          email: true
      - name: password
        type: String
        validation:
          required: true
          pattern: "^(?=.*[A-Za-z])(?=.*\\d)[A-Za-z\\d]{8,}$"
```

Tip: Check the repository’s full CRUD spec YAML example to see the complete supported configuration surface.

Upgrade Notes

  • fields.validation is optional: add it only where needed.
  • spring.jpa.open-in-view defaults to false. If your project relies on OSIV behavior, explicitly set it to true.

If you build Spring Boot services frequently and want to reduce repetitive CRUD boilerplate, feel free to try the generator and share feedback. If you find the project useful, I’d really appreciate a star on the repository; it helps a lot and keeps the momentum going.

Repository: https://github.com/mzivkovicdev/spring-crud-generator

Development continues, and more improvements are on the way. Thanks for the support!

By Marko Zivkovic
Swift Concurrency Part 4: Actors, Executors, and Reentrancy

In this article, we will dive deep into actors, nonisolated methods, @MainActor and other global actors, and the concept of actor reentrancy. We will also explore what happens behind the scenes in the Swift concurrency runtime, including jobs, executors, workers, and schedulers, so you can understand not just how to use these tools, but why they work the way they do. Whether you’re already using Swift’s async/await features or just starting to explore concurrency, this guide will give you a solid understanding of the mechanisms that keep your concurrent code safe and efficient.

Actors and Isolation in Swift Concurrency

If you’ve spent years working with Grand Central Dispatch (GCD), you already know the core problem: shared mutable state. When multiple threads can read and write the same data at the same time, you risk data races: inconsistent reads, lost updates, or crashes that only appear under heavy load. With GCD, we relied on discipline, using serial queues or locks. But discipline fails. One forgotten .sync call and your correctness vanishes. Swift concurrency introduces actors to make data-race freedom a language-level guarantee.

Class vs. Struct vs. Actor

| Type   | Semantics | Thread Safety     | Mutation Model       |
|--------|-----------|-------------------|----------------------|
| Struct | Value     | By-copy safe      | Explicit mutating    |
| Class  | Reference | Unsafe by default | Shared mutable state |
| Actor  | Reference | Data-race safe    | Serialized access    |

Actors sit exactly where classes used to be, but with correctness guarantees.

Actor Basics

An actor is a reference type that protects its mutable state through isolation. Unlike a class, you cannot accidentally touch an actor’s internal state from multiple threads.

```swift
actor BankStore {
    private var balance: Int = 0

    func deposit(_ amount: Int) {
        balance += amount
    }

    func withdraw(_ amount: Int) -> Bool {
        guard balance >= amount else { return false }
        balance -= amount
        return true
    }
}
```

Key properties of actors:

  • Reference semantics
  • Only one task at a time can access actor-isolated state
  • External access requires await

nonisolated: Opting Out of Isolation

Sometimes you need functionality that doesn’t touch the actor’s state or needs to be callable synchronously. Use the nonisolated keyword for these “pure” utilities.

```swift
import Foundation

actor ImageCache {
    nonisolated static let maxItems = 100

    // Does not touch actor state, so it can be called without await.
    nonisolated func cacheKey(for url: URL) -> String {
        url.absoluteString
    }
}
```

Rule of thumb: if it reads or writes mutable actor state, it should not be nonisolated.

The Actor Model: The Mailbox Mental Model

Think of an actor as having a mailbox:

  • Each actor has a queue of pending work.
  • Messages (calls) are enqueued as tasks.
  • The actor processes these one at a time.

When you write await store.deposit(50), you aren’t calling a function in the traditional sense. You are sending a message to the actor and suspending your current task (not the thread, which is free to run other work) until the actor finishes processing that message. This is why await is mandatory: the actor might be busy with someone else’s request.

Working With @MainActor and Other Global Actors

When building scalable iOS applications, managing shared state across isolated domains like UI components, network layers, and local caches becomes a complex puzzle. Swift simplifies this with the @globalActor attribute. A global actor is essentially a singleton actor. It allows you to isolate state and operations globally without needing to pass an actor reference around your entire dependency graph.

The most famous of these is, of course, @MainActor. The @MainActor is uniquely tied to the main thread. Anything marked with this attribute is guaranteed to execute on the main thread, making it the bedrock for all UI updates.

```swift
@MainActor
final class FlashcardViewModel: ObservableObject {
    @Published var currentCard: Card?

    func loadNextCard() async {
        // Safe to update UI state directly; we are isolated to the MainActor.
        // Card and fetchCard() are assumed to be defined elsewhere.
        self.currentCard = await fetchCard()
    }
}
```

However, the power of global actors isn’t limited to the main thread. You can define your own global actors to serialize access to highly contested shared resources, such as a centralized local database or an aggressive retry policy manager.

```swift
@globalActor
public actor SyncActor {
    public static let shared = SyncActor()
}

@SyncActor
final class OfflineSyncManager {
    // Mutation is assumed to be defined elsewhere.
    var pendingMutations: [Mutation] = []

    func queue(mutation: Mutation) {
        pendingMutations.append(mutation)
    }
}
```

By annotating OfflineSyncManager with @SyncActor, you guarantee that all accesses to pendingMutations are serialized on that specific actor’s executor, eliminating data races from different parts of your app trying to queue offline changes simultaneously.

Actor Reentrancy Explained

If you’re coming from the world of Grand Central Dispatch (GCD) and DispatchQueue, actors require a fundamental mental shift. A serial dispatch queue executes tasks strictly one after another. If a task is running, nothing else can run on that queue until it finishes.

Swift actors are different: they are reentrant.

Reentrancy means that while an actor guarantees mutual exclusion for synchronous code execution (only one task can be running inside the actor at a time), it explicitly allows other tasks to interleave at suspension points.

When an actor-isolated method reaches an await, the current task suspends. Crucially, it also gives up its hold on the actor’s executor. During this suspension, the actor is free to pick up and execute other pending tasks. Once the awaited operation finishes, the original task is scheduled to resume on the actor when it’s free again.

This design prevents deadlocks.
If actors weren’t reentrant, two actors awaiting each other would deadlock and freeze your application. However, reentrancy introduces its own subtle class of concurrency bugs.

The Hidden Risks of Suspending Inside Actor Methods

Because the actor is free to run other work during an await, the state of your actor before the await might not match the state after the await. This is the single biggest trap engineers fall into when adopting Swift concurrency.

Imagine implementing a session manager that fetches a fresh authentication token. If multiple requests fail and trigger a token refresh simultaneously, you might accidentally fire off multiple network requests if you don’t account for reentrancy.

```swift
actor SessionManager {
    private var cachedToken: String?

    // performNetworkRefresh() is assumed to be defined elsewhere in the actor.
    func getValidToken() async throws -> String {
        // 1. Check local state
        if let token = cachedToken { return token }

        // 2. Suspend! The actor is now free to process other calls to `getValidToken()`
        let freshToken = try await performNetworkRefresh()

        // 3. State mutation.
        // DANGER: If another task interleaved during step 2, we might overwrite a valid token,
        // or we just unnecessarily performed multiple network requests.
        self.cachedToken = freshToken
        return freshToken
    }
}
```

To protect against this, you must rethink how you handle in-flight asynchronous operations. Instead of caching just the result, you often need to cache the Task itself.

```swift
actor SessionManager {
    private var cachedToken: String?
    private var refreshTask: Task<String, Error>?

    func getValidToken() async throws -> String {
        if let token = cachedToken { return token }

        // Return the in-flight task if one exists
        if let existingTask = refreshTask {
            return try await existingTask.value
        }

        // Otherwise, create a new task and cache IT immediately
        let task = Task {
            // Clear the in-flight marker on success and on failure,
            // so a failed refresh does not block every later call.
            defer { self.refreshTask = nil }
            let freshToken = try await performNetworkRefresh()
            self.cachedToken = freshToken
            return freshToken
        }
        self.refreshTask = task
        return try await task.value
    }
}
```

Always remember: the actor still prevents data races, but across an await any of its state may have been changed by another task, so never assume that what you checked before the await still holds after it.

Inside the Swift Concurrency Runtime

To truly master structured concurrency, we need to step out of the syntax and into the engine room. Swift’s concurrency model isn’t just syntactic sugar over GCD; it is a bespoke, highly optimized runtime built around a cooperative thread pool.

Understanding Jobs

In the Swift runtime, a Job is the fundamental unit of schedulable work. When you write an async function, the compiler breaks your function down into partial tasks or “continuations” split at every potential suspension point (each await). Each of these segments is wrapped into a Job.

When a task suspends, the current Job finishes. When the awaited result is ready, a new Job is enqueued to resume the remainder of the function. Jobs are lightweight, heavily optimized, and managed entirely by the Swift runtime.

How Executors Work

If Jobs are the work, Executors are the environments where the work is allowed to happen. An executor defines the execution semantics for a set of Jobs.

Every actor has a serial executor. This executor acts as a funnel, ensuring that only one Job associated with that actor runs at any given time. When you call an actor method, you are submitting a Job to that actor’s executor.

Custom Serial Executors (Actor Level)

In the first example, we create a MainQueueExecutor conforming to SerialExecutor.
This is particularly useful when you have a legacy codebase heavily dependent on a specific DispatchQueue and you want to wrap that logic into a modern actor.

```swift
import Dispatch

final class MainQueueExecutor: SerialExecutor {
    func enqueue(_ job: consuming ExecutorJob) {
        let unownedJob = UnownedJob(job)
        let unownedExecutor = asUnownedSerialExecutor()
        // Every job for this executor is funneled through the (serial) main queue.
        DispatchQueue.main.async {
            unownedJob.runSynchronously(on: unownedExecutor)
        }
    }

    func asUnownedSerialExecutor() -> UnownedSerialExecutor {
        UnownedSerialExecutor(ordinary: self)
    }
}

@globalActor
actor CustomGlobalActor: GlobalActor {
    static let sharedUnownedExecutor = MainQueueExecutor()
    static let shared = CustomGlobalActor()

    nonisolated var unownedExecutor: UnownedSerialExecutor {
        Self.sharedUnownedExecutor.asUnownedSerialExecutor()
    }
}
```

Task Executors (Task Level)

While a SerialExecutor protects an actor’s state, a TaskExecutor influences the “ambient” environment where a task and its children run. It doesn’t provide serial isolation; it provides a preferred execution location.

```swift
import Dispatch

final class MainQueueExecutor: TaskExecutor {
    func enqueue(_ job: consuming ExecutorJob) {
        let unownedJob = UnownedJob(job)
        self.enqueue(unownedJob)
    }

    func enqueue(_ job: UnownedJob) {
        let unownedExecutor = asUnownedTaskExecutor()
        DispatchQueue.main.async {
            job.runSynchronously(on: unownedExecutor)
        }
    }

    func asUnownedTaskExecutor() -> UnownedTaskExecutor {
        UnownedTaskExecutor(ordinary: self)
    }
}

let executor = MainQueueExecutor()
Task.detached(executorPreference: executor) {
    // TODO: Perform an async operation
}
```

What Workers Do

Executors don’t magically run code; they need CPU threads. This is where Workers come in. In Swift concurrency, there is a global, cooperative thread pool. The threads inside this pool are the “workers.”

Unlike GCD, which can spawn hundreds of threads, leading to thread explosion and massive memory overhead, the Swift thread pool is limited, generally to the number of active CPU cores. However, this isn’t a hard-and-fast rule; there are specific cases where the pool may spawn more threads. We took a deep dive into this behavior in the article Swift Concurrency: Part 1.

Workers ask executors for Jobs. When a worker thread picks up a Job from an executor, it executes it until completion or suspension. Because the number of workers is limited, Swift enforces a strict rule: you must never use blocking APIs (like semaphores or synchronous network calls) inside an async context. If you block a worker thread, you take that thread away from the concurrency runtime for as long as the call blocks, and with a pool this small that can stall unrelated work.

The Role of Schedulers

The Scheduler is the invisible conductor orchestrating this entire process. It decides which Jobs sit on which Executors, and which Workers get assigned to process them.

The scheduler is highly priority-aware. When you spawn a Task(priority: .userInitiated), the scheduler places the resulting job ahead of background jobs in the queue. It handles the complex logic of priority inversion avoidance, waking up worker threads, and balancing the load across the CPU.

Types of Executors and How They’re Chosen

Swift utilizes different types of executors depending on the context of your code:

  • The global concurrent executor: If your code is not isolated to any actor (e.g., a detached task or a standalone async function), it runs on the default global concurrent executor. This executor distributes Jobs freely across all available workers in the cooperative thread pool.
  • The main actor executor: This is a specialized serial executor permanently bound to the application’s main thread. The scheduler ensures that any Job submitted here is handed off to the main runloop.
  • Default serial executors: Every standard actor you create gets its own default serial executor. The runtime dynamically maps this executor to any available worker thread in the pool as needed.
  • Custom executors (Swift 5.9+): Advanced use cases might require overriding how an actor executes its jobs.
    By implementing the SerialExecutor protocol, you can create custom executors, for instance, to force an actor to run its jobs on a specific, legacy DispatchQueue to interoperate with older C++ or Objective-C codebases.

How the Runtime Chooses an Executor

Understanding that executors exist is one thing; predicting exactly where your code will run is another. When a Job is ready to execute, the Swift runtime evaluates a decision tree to route that workload.

Here is the algorithm the runtime uses to select an executor:

Is the method isolated (that is, bound to a specific actor)?

  • No (non-isolated): Is there a preferred task executor?
      • Yes: The task executes on the preferred task executor.
      • No: The task executes on the standard global concurrent executor.
  • Yes (actor-isolated): Does the actor provide its own custom executor?
      • Yes: The task executes strictly on the actor’s custom executor.
      • No: Does the current task have a preferred executor?
          • Yes: The task executes on the preferred task executor (while still strictly upholding the actor’s serial isolation).
          • No: The task executes on the default actor executor.

This cascading logic ensures that actors maintain their state safety while allowing developers to influence the underlying execution environment when necessary.

Inspecting Your Context: The #isolation Macro

When dealing with deep call stacks and complex async boundaries, you might lose track of your current execution context. Swift 6 introduced a useful diagnostic tool for this (SE-0420): the #isolation macro.

The macro expands at compile time to an expression that captures the actor isolation of the context where it appears. Its value is an (any Actor)? holding the actor you are currently isolated to, or nil if the context is nonisolated. Because a plain nonisolated function always sees nil, a helper should take the caller’s isolation as a defaulted parameter.

```swift
func debugCurrentContext(isolation: isolated (any Actor)? = #isolation) {
    // The default argument captures the caller's isolation (nil if the caller is nonisolated).
    print(isolation.map { "isolated to \($0)" } ?? "no isolation")
}
```

Sprinkling this into your logging infrastructure is invaluable when debugging data races or verifying that a heavy computation isn’t accidentally blocking the @MainActor.

Task Executors vs. Actor Executors

With recent advancements in Swift Evolution (specifically SE-0417 and SE-0392), developers can now provide custom executors. To use this power safely, you must understand the difference between the two primary executor protocols: TaskExecutor and SerialExecutor (the protocol behind what this article calls an actor executor).

What is a Task Executor?

A Task Executor governs the execution environment for a specific Task hierarchy.

Crucially, a Task Executor is inherently concurrent. It represents a thread pool or a concurrent queue where multiple jobs can be processed simultaneously. When you assign a preferred Task Executor, you are telling the runtime, “Unless an actor says otherwise, run the asynchronous work for this task hierarchy over here.”

What is an Actor Executor?

An Actor Executor (a type that conforms to the SerialExecutor protocol) governs the execution environment for a specific actor instance.

Unlike a Task Executor, an Actor Executor is strictly serial. It processes one job at a time, enforcing the mutual exclusion that makes actors safe from data races.

The Danger of Custom Implementations

Understanding the concurrent nature of Task Executors and the serial nature of Actor Executors is not just trivia; it is a strict runtime invariant.

If you decide to write a custom executor (for example, wrapping an old C++ thread pool or a specific Grand Central Dispatch queue), you carry the burden of upholding these invariants:

  • If you implement a SerialExecutor for an actor, but your underlying implementation accidentally allows concurrent execution, you will break the actor’s state isolation and introduce data races that are very hard to reproduce.
  • Conversely, if you implement a TaskExecutor but back it with a serial queue, you risk starving the cooperative thread pool and introducing unexpected deadlocks across your async task hierarchies.

The compiler trusts you to maintain these semantic guarantees. If you break them, the concurrency model shatters.

Conclusion

Swift concurrency is more than syntactic sugar for asynchronous code. It is a carefully designed execution model that formalizes how work is scheduled, isolated, and resumed. Actors provide safety guarantees, but understanding reentrancy and executor behavior is what allows engineers to reason about concurrency with confidence.

By understanding these low-level mechanics, namely when an actor temporarily releases isolation and how the runtime schedules jobs across worker threads, you can build iOS applications that are not only performant but also resilient to the subtle concurrency bugs that once plagued asynchronous systems.

By Nikita Vasilev
Building an Image Classification Pipeline With Apache Camel and Deep Java Library (DJL)

Image classification is now a key part of many applications. Whether you’re automating photo organization, filtering uploaded content, or enriching product catalogs with visual tags, knowing what’s in an image can be just as important as knowing what a user typed.

For Java developers, the challenge is familiar: most computer vision examples live in Python notebooks, while the systems that actually need image classification run on the JVM. Bridging that gap usually means standing up a separate Python microservice, managing REST calls, and dealing with serialization overhead. That’s a lot of ceremony for what should be a single processing step.

This tutorial shows how to build an image classification pipeline in pure Java with Apache Camel and the Deep Java Library (DJL). We’ll cover watching folders for new images, running classification with a pre-trained ResNet model, tidying up the predictions into clean reports, and routing results to output files, all while leaning on the Enterprise Integration Patterns you’re probably already familiar with.

What You'll Learn

By the time you’re done here, you’ll be able to:

  • Develop a file-based image classification pipeline using Apache Camel.
  • Use a pre-trained ResNet image classification model via Camel’s DJL component.
  • Understand the djl: URI syntax and model configuration for computer vision tasks in Apache Camel.
  • Structure routes with content-based routing and multiple formatter beans.
  • Run image classification locally using Java and Apache Camel, without external APIs or Python services.

Frameworks Used

Apache Camel

Apache Camel is an open-source integration framework built on Enterprise Integration Patterns. It provides components for connecting systems, moving data, and orchestrating workflows using declarative routes. In this project, we use it for file ingestion, message transformation, content-based routing, bean integration, error handling, and output persistence.
Deep Java Library (DJL)

DJL is an engine-agnostic deep learning framework for Java. It provides a high-level API for inference, training, and serving deep learning models right on the JVM. We use the Camel DJL component to load a pre-trained ResNet model from the DJL Model Zoo, run image classification inference inside the JVM, and return structured classification results.

ResNet for Image Classification

Residual Network (ResNet) is a deep convolutional neural network architecture that introduced skip connections to mitigate the vanishing gradient problem. The model we use here is pre-trained on the ImageNet dataset, which covers 1,000 categories: animals, vehicles, everyday objects, food items, you name it. It strikes a nice balance between accuracy and inference speed for CPU-based classification.

Project Structure

Let's look at the project structure below:

```text
camel-image-classifier/
├── src/main/java/com/example/imageclassifier/
│   ├── MainApp.java                          # Application entry point
│   ├── routes/
│   │   └── ImageClassificationRoutes.java    # Camel route for image processing
│   └── processor/
│       ├── ClassificationsFormatter.java     # Formats DJL Classifications output
│       ├── MapResultsFormatter.java          # Formats Map-based results
│       └── FallbackFormatter.java            # Handles unexpected outputs
├── src/main/resources/
│   └── application.properties                # Camel configuration
├── data/
│   ├── input/                                # Drop JPEG images here
│   ├── output/                               # Classification results (text files)
│   └── classified/                           # Processed images archive
├── gradle/wrapper/                           # Gradle wrapper files
├── build.gradle                              # Project dependencies
├── settings.gradle                           # Gradle settings
├── gradlew.bat                               # Gradle wrapper script
└── README.md                                 # Main documentation
```

Gradle Dependencies

build.gradle

```groovy
plugins {
    id 'java'
    id 'application'
}

group = 'com.example'
version = '1.0.0'
description = 'Image Classification with Apache Camel and DJL'

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

application {
    mainClass = 'com.example.imageclassifier.MainApp'
}

repositories {
    mavenCentral()
}

dependencies {
    // Apache Camel
    implementation 'org.apache.camel:camel-core:4.4.0'
    implementation 'org.apache.camel:camel-main:4.4.0'
    implementation 'org.apache.camel:camel-file:4.4.0'
    implementation 'org.apache.camel:camel-djl:4.4.0'

    // DJL (Deep Java Library) for image classification
    implementation platform('ai.djl:bom:0.28.0')
    implementation 'ai.djl:api'

    // MXNet engine for image classification (used by Camel DJL component)
    implementation 'ai.djl.mxnet:mxnet-engine'
    implementation 'ai.djl.mxnet:mxnet-model-zoo'

    // Use CPU-only MXNet runtime for Windows
    runtimeOnly 'ai.djl.mxnet:mxnet-native-mkl:1.9.1:win-x86_64'

    // Logging
    implementation 'org.slf4j:slf4j-simple:2.0.9'
}
```

A few things to note here compared to a typical NLP setup. For image classification, we use the MXNet engine instead of PyTorch. MXNet’s model zoo ships with a well-tested ResNet model for image classification, and the mxnet-native-mkl dependency gives you CPU-optimized native libraries via Intel MKL. The native classifier shown here (win-x86_64) targets Windows, so other platforms need the matching classifier. The DJL BOM keeps the versions consistent across engines and models.
Application Entry Point

The application starts from the MainApp class, which boots Camel using Main:

```java
package com.example.imageclassifier;

import com.example.imageclassifier.routes.ImageClassificationRoutes;
import org.apache.camel.main.Main;

public class MainApp {

    public static void main(String[] args) throws Exception {
        System.out.println("=================================================");
        System.out.println("Image Classification with Apache Camel and DJL");
        System.out.println("=================================================");

        // Create and configure Camel Main
        Main main = new Main();

        // Add routes
        main.configure().addRoutesBuilder(new ImageClassificationRoutes());

        // Start Camel
        System.out.println("\nStarting Apache Camel...");
        System.out.println("Watching folder: data/input");
        System.out.println("Output folder: data/output");
        System.out.println("Press Ctrl+C to stop\n");

        main.run();
    }
}
```

Image Classification Route

ImageClassificationRoutes.java is where the core logic lives, built around the Camel DJL component’s URI. The from endpoint handles image ingestion: it watches for JPEG files, hands them to the route one at a time, and archives each with a timestamp. An inline processor extracts the file’s bytes, and a single to endpoint runs image classification through the DJL component URI. The route then dispatches to the right formatter using Camel’s content-based routing.

ImageClassificationRoutes.java

```java
package com.example.imageclassifier.routes;

import com.example.imageclassifier.processor.ClassificationsFormatter;
import com.example.imageclassifier.processor.FallbackFormatter;
import com.example.imageclassifier.processor.MapResultsFormatter;
import org.apache.camel.builder.RouteBuilder;

import java.io.File;
import java.nio.file.Files;

/**
 * Apache Camel routes for image classification.
 */
public class ImageClassificationRoutes extends RouteBuilder {

    @Override
    public void configure() throws Exception {
        // Route to process JPEG images from input folder
        from("file:data/input?include=.*\\.(jpg|jpeg|JPG|JPEG)&noop=false&move=../classified/${date:now:yyyyMMdd-HHmmss}-${file:name}")
            .routeId("image-classification-route")
            .log("Processing image: ${file:name}")
            // Read file into bytes so the DJL component can create an Image internally.
            .process(exchange -> {
                File imageFile = exchange.getIn().getBody(File.class);
                exchange.getIn().setBody(Files.readAllBytes(imageFile.toPath()));
            })
            // Run inference via Camel DJL component.
            .to("djl:cv/image_classification?artifactId=ai.djl.mxnet:resnet:0.0.1")
            // Convert output to a text report using Camel choice/bean components.
            .choice()
                .when(body().isInstanceOf(ai.djl.modality.Classifications.class))
                    .bean(new ClassificationsFormatter(), "format")
                .when(body().isInstanceOf(java.util.Map.class))
                    .bean(new MapResultsFormatter(), "format")
                .otherwise()
                    .bean(new FallbackFormatter(), "format")
            .end()
            .log("Inference done for ${file:name}")
            // Write results to output folder
            .to("file:data/output?fileName=${date:now:yyyyMMdd-HHmmss}-${file:name.noext}.txt")
            .log("Results saved to output folder");
    }
}
```

Let’s break this down:

Stage 1: File Ingestion

```java
from("file:data/input?include=.*\\.(jpg|jpeg|JPG|JPEG)&noop=false&move=../classified/${date:now:yyyyMMdd-HHmmss}-${file:name}")
```

The from endpoint watches the data/input/ folder for JPEG files. The regex pattern include=.*\\.(jpg|jpeg|JPG|JPEG) makes sure only files with a JPEG extension get picked up. Once processed, each image is moved to data/classified/ with a timestamp prefix, which prevents reprocessing and provides a clean audit trail. Setting noop=false means the file is consumed (moved), not left in place.
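As a rough illustration of the two expressions in that endpoint URI, the plain-Java sketch below applies the same regular expression to a few file names and builds an archive name with the same timestamp layout. It uses only java.util.regex and java.time, not Camel itself; the class and method names are hypothetical, and Camel's own matching may differ in details such as case sensitivity.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Pattern;

final class IncludeFilterSketch {
    // Same expression as the include option (one backslash once the Java string is unescaped).
    static final Pattern INCLUDE = Pattern.compile(".*\\.(jpg|jpeg|JPG|JPEG)");
    // Same layout as the ${date:now:yyyyMMdd-HHmmss} expression in the move option.
    static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    // True when the whole file name matches the include expression.
    static boolean accepted(String fileName) {
        return INCLUDE.matcher(fileName).matches();
    }

    // Builds "<timestamp>-<original name>", the naming scheme used for archived images.
    static String archiveName(String fileName, LocalDateTime when) {
        return STAMP.format(when) + "-" + fileName;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"cat.jpg", "dog.JPEG", "notes.txt", "photo.Jpg"}) {
            System.out.println(name + " -> " + accepted(name));
        }
        // prints 20240115-093005-cat.jpg
        System.out.println(archiveName("cat.jpg", LocalDateTime.of(2024, 1, 15, 9, 30, 5)));
    }
}
```

Under plain java.util.regex semantics, a mixed-case name such as photo.Jpg is rejected, because the expression lists only the all-lowercase and all-uppercase extensions.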
Stage 2: Image to Bytes

Java

    .process(exchange -> {
        File imageFile = exchange.getIn().getBody(File.class);
        exchange.getIn().setBody(Files.readAllBytes(imageFile.toPath()));
    })

The DJL component expects the image as a byte[] so it can construct a DJL Image object internally. This inline processor reads the file into a byte array and replaces the message body with it. It’s a small but essential step; without it, the DJL component would receive a File reference instead of the encoded image bytes.

Stage 3: DJL Inference

Java

    .to("djl:cv/image_classification?artifactId=ai.djl.mxnet:resnet:0.0.1")

This single line is the heart of the pipeline. Let’s unpack the URI:

  • djl – The Camel DJL component
  • cv/image_classification – The computer vision task type (as opposed to nlp/sentiment_analysis used in NLP tasks)
  • artifactId=ai.djl.mxnet:resnet:0.0.1 – Identifies the pre-trained ResNet model from DJL’s MXNet Model Zoo

This single line replaces what would otherwise be hundreds of lines of model loading, image preprocessing, tensor conversion, and inference code.

Stage 4: Content-Based Routing

Java

    .choice()
        .when(body().isInstanceOf(ai.djl.modality.Classifications.class))
            .bean(new ClassificationsFormatter(), "format")
        .when(body().isInstanceOf(java.util.Map.class))
            .bean(new MapResultsFormatter(), "format")
        .otherwise()
            .bean(new FallbackFormatter(), "format")
    .end()

Here’s something you’ll run into with image classification that you won’t see in the sentiment analysis setup: the DJL component can return different types depending on the engine and model version. Most of the time, you get a Classifications object, but some MXNet model variants hand back a Map<String, Float> instead. Rather than assuming one type and risking a ClassCastException in production, we use Camel’s Content-Based Router pattern to dispatch to the right formatter bean. The FallbackFormatter catches anything unexpected, so an unknown result type still produces a report instead of failing the exchange.
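The dispatch rule itself can be sketched in plain Java, without Camel or DJL on the classpath. This is only an illustration of the first-match-wins behaviour of choice()/when()/otherwise(), not how Camel implements it. The names RoutingSketch, Ranked, and route are hypothetical, and Ranked is a stand-in for DJL's Classifications, not the real class.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of first-match-wins dispatch on the body type. */
final class RoutingSketch {

    /** Stand-in for ai.djl.modality.Classifications; not the real DJL class. */
    static final class Ranked {
        final List<String> labels;

        Ranked(List<String> labels) {
            this.labels = labels;
        }
    }

    /** Picks a formatter name the way the route's choice() block picks a bean. */
    static String route(Object body) {
        if (body instanceof Ranked) {        // corresponds to the first when(...)
            return "ClassificationsFormatter";
        } else if (body instanceof Map) {    // corresponds to the second when(...)
            return "MapResultsFormatter";
        }
        return "FallbackFormatter";          // corresponds to otherwise()
    }

    public static void main(String[] args) {
        System.out.println(route(new Ranked(List.of("golden retriever")))); // ClassificationsFormatter
        System.out.println(route(Map.of("golden retriever", 0.95f)));       // MapResultsFormatter
        System.out.println(route("unexpected"));                            // FallbackFormatter
        System.out.println(route(null));                                    // FallbackFormatter
    }
}
```

Because the checks run in order, a body that satisfied both predicates would go to the first branch, and anything that satisfies neither, including null, lands in the fallback.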
Content-based routing is a classic Enterprise Integration Pattern, and it’s one of the biggest advantages of using Camel for ML pipelines. The routing logic is declarative, testable, and easy to extend.

Formatter Beans

ClassificationsFormatter.java

This is the primary formatter, handling the standard Classifications output from DJL:

Java

    package com.example.imageclassifier.processor;

    import ai.djl.modality.Classifications;
    import org.apache.camel.Exchange;

    import java.util.List;

    /**
     * Bean to format DJL Classifications object into a text report.
     */
    public class ClassificationsFormatter {

        public String format(Classifications classifications, Exchange exchange) {
            StringBuilder sb = new StringBuilder();
            String fileName = exchange.getIn().getHeader("CamelFileName", String.class);
            sb.append("Image: ").append(fileName).append('\n');

            List<Classifications.Classification> topK = classifications.topK(5);
            if (!topK.isEmpty()) {
                Classifications.Classification top = topK.get(0);
                sb.append("Top Prediction: ").append(top.getClassName())
                  .append(" (Confidence: ").append(String.format("%.2f%%", top.getProbability() * 100))
                  .append(")\n\n");
            }

            sb.append("Top 5 predictions:\n");
            for (int i = 0; i < topK.size(); i++) {
                Classifications.Classification c = topK.get(i);
                sb.append(String.format("%d. %s: %.2f%%\n", i + 1, c.getClassName(), c.getProbability() * 100));
            }
            return sb.toString();
        }
    }

The topK(5) call extracts the five most confident predictions. Each classification carries a class name (e.g., “golden retriever”) and a probability score. The formatter produces a clean, human-readable report with the top prediction highlighted and all five ranked below it.

MapResultsFormatter.java

Some MXNet model variants return results as a Map<String, Float> instead of a Classifications object.
This formatter handles that case:

Java

    package com.example.imageclassifier.processor;

    import org.apache.camel.Exchange;

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    /**
     * Bean to format Map-based classification results into a text report.
     * Handles HashMap output from MXNet models.
     */
    public class MapResultsFormatter {

        public String format(Map<String, Float> results, Exchange exchange) {
            StringBuilder sb = new StringBuilder();
            String fileName = exchange.getIn().getHeader("CamelFileName", String.class);
            sb.append("Image: ").append(fileName).append('\n');

            // Convert to sorted list by probability (descending)
            List<Map.Entry<String, Float>> sortedResults = new ArrayList<>(results.entrySet());
            sortedResults.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));

            // Get top 5
            List<Map.Entry<String, Float>> top5 = sortedResults.subList(0, Math.min(5, sortedResults.size()));
            if (!top5.isEmpty()) {
                Map.Entry<String, Float> topEntry = top5.get(0);
                sb.append("Top Prediction: ").append(topEntry.getKey())
                  .append(" (Confidence: ").append(String.format("%.2f%%", topEntry.getValue() * 100))
                  .append(")\n\n");
            }

            sb.append("Top 5 predictions:\n");
            for (int i = 0; i < top5.size(); i++) {
                Map.Entry<String, Float> entry = top5.get(i);
                sb.append(String.format("%d. %s: %.2f%%\n", i + 1, entry.getKey(), entry.getValue() * 100));
            }
            return sb.toString();
        }
    }

Since a Map has no inherent ordering, we sort the entries by value in descending order before pulling out the top 5. The output format mirrors ClassificationsFormatter exactly, so downstream consumers don’t need to care which formatter produced the report.

FallbackFormatter.java

In case of an unexpected output type, the FallbackFormatter makes sure the pipeline keeps producing meaningful output rather than crashing.
This follows a critical production pattern, fail softly:

Java

    package com.example.imageclassifier.processor;

    import org.apache.camel.Exchange;

    /**
     * Bean to format unexpected result types into a text report.
     */
    public class FallbackFormatter {

        public String format(Object result, Exchange exchange) {
            StringBuilder sb = new StringBuilder();
            String fileName = exchange.getIn().getHeader("CamelFileName", String.class);
            sb.append("Image: ").append(fileName).append('\n');
            sb.append("Raw result type: ").append(result == null ? "null" : result.getClass().getName()).append('\n');
            sb.append("Result:\n").append(String.valueOf(result)).append('\n');
            return sb.toString();
        }
    }

How to Run the Application

Build and run using the Gradle wrapper: ./gradlew clean run (gradlew.bat clean run on Windows). Then drop JPEG images into data/input/. For example, place a photo of a dog. The classification result is written to data/output/, and the original image is archived to data/classified/ with a timestamp.

Example output:

Plain Text

    Image: golden_retriever.jpg
    Top Prediction: golden retriever (Confidence: 95.67%)

    Top 5 predictions:
    1. golden retriever: 95.67%
    2. Labrador retriever: 2.34%
    3. tennis ball: 1.12%
    4. cocker spaniel: 0.45%
    5. Irish setter: 0.23%

The model recognizes the 1,000 ImageNet categories: animals, vehicles, everyday objects, food items, plants, and more.

Sentiment Analysis vs. Image Classification: Side by Side

If you read my previous article on building a sentiment analysis pipeline with Camel and DJL, you’ll notice a deliberate symmetry between the two projects.
The table below highlights the key differences:

| Aspect | Sentiment Analysis | Image Classification |
|---|---|---|
| DJL Task Type | nlp/sentiment_analysis | cv/image_classification |
| Model | DistilBERT (PyTorch) | ResNet (MXNet) |
| Input | Text files (.txt) | JPEG images (.jpg, .jpeg) |
| Input Preprocessing | Files.readString() → String | Files.readAllBytes() → byte[] |
| DJL Engine | PyTorch | MXNet |
| Output | Positive/Negative with confidence | Top 5 category predictions |
| Formatter Count | 2 (Classifications + Fallback) | 3 (Classifications + Map + Fallback) |

The core Camel route structure (file ingestion, DJL inference, content-based routing, and formatted output) is identical. That’s the power of the Camel + DJL integration: switching from NLP to computer vision is essentially a URI change and a different set of dependencies. The integration pattern stays the same.

DJL Behind the Scenes

On first execution, the ResNet model (~100MB) is downloaded automatically from the DJL Model Zoo, and MXNet native libraries are initialized. The model is cached locally under ~/.djl.ai/, so subsequent runs load from cache, making startup significantly faster.

The DJL component handles all the heavy lifting internally: image decoding, resizing to the model’s expected input dimensions, tensor conversion, forward pass through the neural network, and softmax normalization of the output probabilities. You don’t write any of this code; the Camel DJL component abstracts it away entirely.

Production Considerations

For performance, always warm up the model on startup if latency is a concern. The first inference call triggers model loading and JIT compilation, which can take several seconds. Allocate sufficient JVM heap: image classification models are memory-intensive and typically require 500MB–1GB. Scale horizontally with multiple Camel instances watching different input directories, or vertically using GPU-enabled DJL engines.
MXNet supports CUDA out of the box: swap the mxnet-native-mkl dependency for mxnet-native-cu* to enable GPU acceleration.

The content-based router with a fallback formatter makes sure the pipeline doesn’t crash on unexpected model output. For production deployments, consider adding Camel’s onException handler for retries and dead-letter routing. Camel’s built-in metrics and JMX support also give you visibility into processing rates, error counts, and route performance, which is critical for production ML pipelines.

Conclusion

This tutorial demonstrates that computer vision doesn’t need to be a separate system. With Apache Camel and DJL, image classification becomes just another step in your integration flow: composable, observable, and production-ready. There’s no per-request API cost, image data stays on-premise, and you have full control over routing and error handling.

Compared to calling external vision APIs (Google Vision, AWS Rekognition, Azure Computer Vision), you get zero network latency for inference, no data leaving your infrastructure, and predictable cost regardless of volume. Compared to standing up a Python Flask service with TensorFlow or PyTorch, you get native integration with enterprise Java systems and first-class support for Enterprise Integration Patterns.

If you already use Camel, adding computer vision capabilities is no longer a leap. It’s a small, well-structured step.

By Vignesh Durai

Top Frameworks Experts

Justin Albano

Software Engineer,
IBM

I am devoted to continuously learning and improving as a software developer and sharing my experience with others in order to improve their expertise. I am also dedicated to personal and professional growth through diligent studying, discipline, and meaningful professional relationships. When not writing, I can be found playing hockey, practicing Brazilian Jiu-jitsu, watching the NJ Devils, reading, writing, or drawing. ~II Timothy 1:7~ Twitter: @justinmalbano

The Latest Frameworks Topics

Logging What AI Agents Do in Salesforce: A Simple One-Object Audit Framework
Log every AI agent action to one custom object, and force the LLM to include a reasoning field in every tool call so you always know why it did what it did.
June 9, 2026
by Ronith Pingili
· 446 Views · 1 Like
Token Attribution Framework for Agentic AI in CI/CD
A practical framework for tracking attribution, setting budgets, and circuit-breaking LLM spending in your CI/CD pipeline by using an OpenTelemetry implementation.
June 9, 2026
by Intiaz Shaik
· 4,567 Views
Stop Choosing Sides: An Engineering Leader's Framework for Build, Buy, and Hybrid AI Agents in 2026
47% of enterprises already run a hybrid AI agent model — combining off-the-shelf tools with custom development. Most are doing it accidentally.
June 8, 2026
by Amit Srivastava
· 818 Views · 1 Like
How to Interpret the Number of Spring ApplicationContexts in Integration Tests
When optimizing Spring Boot integration tests, developers often focus on obvious metrics, but they do not always explain why an integration test suite is slow.
June 8, 2026
by Constantin Kwiatkowski
· 931 Views
The Middleware Gap in AI Agent Frameworks
Most agent frameworks observe model calls and allow rewriting them only after they reach the model, making an understanding of callbacks and middleware essential.
June 8, 2026
by Ninaad Rao
· 949 Views · 1 Like
Liquid Glass, Material 3, and a Lot of Plumbing
New iOS Modern (liquid glass) and Android Material 3 native themes, how they work in the Playground, in the simulator, and on devices, plus a week of performance and look
June 4, 2026
by Shai Almog DZone Core CORE
· 1,909 Views · 2 Likes
Building Threat Intelligence Pipelines Using Python, APIs, and Elasticsearch
STIX/TAXII in, ECS normalized, provenance preserved: deterministic IDs, correct bulk writes, and ingest pipelines keep threat indicator data reliable and queryable under load.
June 3, 2026
by Krishnaveni Musku
· 2,389 Views
Optimizing Databricks Spark Pipelines Using Declarative Patterns
This article explains why hand-tuning Spark is becoming the slow path — and what the declarative alternatives actually look like in production.
June 1, 2026
by Seshendranath Balla Venkata
· 1,151 Views
Zero-Downtime Deployments for Java Apps on Kubernetes
Achieve zero-downtime deployments for Java applications on Kubernetes using rolling updates, readiness/liveness probes, and graceful shutdown strategies.
May 29, 2026
by Ramya vani Rayala
· 3,515 Views
Stateless JWT Auth Microservice Architecture With Spring Boot 3 and Redis Sentinel
Design a stateless JWT auth service with Spring Boot 3, Redis caching, and Sentinel for high availability, faster token validation, and reduced DB load.
May 27, 2026
by Erkin Karanlık
· 3,184 Views · 1 Like
Kafka and Spark Structured Streaming in Enterprise: The Patterns That Hold Up Under Pressure
The streaming patterns that survive in the enterprise are those built for scale, failure recovery, and long-term operability.
May 27, 2026
by Kuladeep Sandra
· 6,566 Views
Building a Reusable Framework to Standardize API Ingestion in an On-Prem Lakehouse
A reusable ingestion framework standardizes connector behavior, reducing maintenance and making onboarding easier to scale.
May 21, 2026
by Kuladeep Sandra
· 1,563 Views
Zone-Free Angular: Unlocking High-Performance Change Detection With Signals and Modern Reactivity
Angular v21 ditches Zone.js: signals trigger rendering directly, toSignal() bridges Observables, and markForCheck() fills the gaps.
May 20, 2026
by Bhanu Sekhar Guttikonda DZone Core CORE
· 1,795 Views · 1 Like
Ujorm3: A New Lightweight ORM for JavaBeans and Records
Learn about Ujorm3, a new lightweight Java ORM for JavaBeans and Records. Enjoy type-safe native SQL mapping, zero reflection, and high performance.
May 19, 2026
by Pavel Ponec
· 1,907 Views · 1 Like
OpenAPI From Code With Spring and Java: A Recipe for Your CI
Have you ever needed to generate OpenAPI documentation directly from your code and, more importantly, do it in a way that fits cleanly into a CI pipeline?
May 19, 2026
by Roman Dubinin
· 3,336 Views · 4 Likes
Smart Deployment Strategies for Modern Applications
Docker packages applications to ensure consistent and portable deployments. Kubernetes manages them with scaling, reliability, and automation in production.
May 18, 2026
by Manju George
· 3,514 Views
Optimizing High-Volume REST APIs Using Redis Caching and Spring Boot (With Load Testing Code)
Cache reads with Redis, use @CachePut for write-through consistency, and prevent stampedes with distributed locks, then prove it works under load with JMeter.
May 18, 2026
by Mallikharjuna Manepalli
· 1,390 Views
Spring CRUD Generator v1.1.0 Updates
Spring CRUD Generator v1.1.0 bootstraps Spring Boot backends from YAML, adding validation, Redis caching fixes, OSIV control, and support for Spring Boot 3/4.
May 18, 2026
by Marko Zivkovic
· 959 Views
Swift Concurrency Part 4: Actors, Executors, and Reentrancy
Swift concurrency eliminates data races with actors, but reentrancy and the runtime model mean you still need to think carefully about state and execution.
May 18, 2026
by Nikita Vasilev
· 1,592 Views · 1 Like
Building an Image Classification Pipeline With Apache Camel and Deep Java Library (DJL)
This tutorial shows how to build an image classification pipeline entirely in Java using Apache Camel and Deep Java Library (DJL).
May 15, 2026
by Vignesh Durai
· 2,934 Views · 1 Like