Let over map merge

Expert Clojure Workflows for AI Agents: Four Skills from Production Experience

Thu, 14 May 2026 00:00:00 +0000

When you let an AI agent write Clojure code, you expect it to leverage the language's superpowers—the REPL's interactivity, structural editing, format-preserving code manipulation, and the rich ecosystem of wrapper libraries. Instead, what you typically see is mediocre code written slowly, as the agent makes the same mistakes every developer learns to avoid.

I discovered this the hard way.

The Setup: Vibe Coding with Observations

While building lite-crm with Claude Code, I deliberately avoided the --dangerously-skip-permissions flag. Instead, I sat beside the agent and watched it work, observing its patterns, frustrations, and failures. What I saw was an agent trained on millions of codebases but ignorant of how Clojure practitioners actually think.

Three concrete problems emerged:

Problem 1: The Wrapper Library Blind Spot

When encountering Java interop, the agent jumps straight into direct interoperability without ever asking: "Is there a Clojure wrapper library for this?"

The result: Uglier code, harder to maintain, and a missed opportunity for idiomatic Clojure.

Problem 2: Formatting Brittleness

Code formatters like cljfmt are essential—but they create a sneaky problem. When the agent modifies source and the formatter shifts indentation by a single space, the agent's subsequent str_replace operations fail due to whitespace mismatch.

The result: I watched it fail, retry, fail again, then give up and rewrite entire files. Enormous token waste.

Problem 3: Primitive Debugging

When a test failed, the agent fell back on the crudest debugging technique: add println statements, run the test, inspect output, delete the logs, restore the code. Repeat.

This is especially wasteful in a Clojure project where I've provided direct access to the REPL via the brepl CLI. The agent could inspect values interactively, test hypotheses instantly, and trace execution without touching source code. But it never did.

The Recognition

These weren't knowledge gaps. They were behavioral gaps—places where the agent's default approach conflicted with Clojure expertise.

In the context of Clojure Stack Lite (which includes a proper testing harness and a real database, not mocks), the agent wasn't just writing suboptimal code. It was making design decisions based on tools it did not know how to use.

I decided to address this not by teaching the agent more facts, but by redirecting its behavior.

Four Skills to Close the Gap

The result is four skills, each targeting a specific behavioral pattern that distinguishes novice agents from expert Clojure practitioners:

1. clj-debug: From Logging to REPL Inspection

The Problem: Agents default to adding println, tap>, or logging statements, then running tests to inspect output.

The Pattern: In Clojure, this is backwards. The REPL lets you pin a value with def, explore its structure instantly, test hypotheses interactively—all without modifying code.

What the skill does: When you're about to debug, clj-debug redirects from logging patterns to REPL-based inline inspection. It teaches the agent to use def, keys, keyword access, and structural exploration—the actual workflow expert Clojure developers follow.

Behavioral change: From edit-test-inspect cycle to interactive REPL inspection. This is faster, non-invasive, and gives immediate feedback.
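As a rough illustration of the inspection style described above, the session below pins a value with def and explores it at the REPL. The captured map and its keys are hypothetical, invented only for this sketch, and nothing here is taken from the skill itself.

```clojure
;; Hypothetical value captured from a failing test; the shape is invented
;; purely for illustration.
(def captured
  {:order  {:id 17
            :lines [{:sku "A-1" :qty 2}
                    {:sku "B-9" :qty 0}]}
   :status :rejected})

;; Explore the structure without touching any source file.
(keys captured)
;; => (:order :status)

(get-in captured [:order :lines 1])
;; => {:sku "B-9", :qty 0}

;; Test a hypothesis: is a zero-quantity line the cause of the rejection?
(filter (comp zero? :qty) (get-in captured [:order :lines]))
;; => ({:sku "B-9", :qty 0})
```

Each expression answers one question immediately, and the source file stays unchanged throughout.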

2. clj-discover: Systematic API Exploration

The Problem: When encountering unfamiliar Java classes or macros, agents jump to direct integration without exploring whether an idiomatic Clojure wrapper already exists.

The Pattern: Expert Clojure developers follow a deliberate workflow:

Search for a Clojure wrapper library first (usually there is one)
If not, inspect the Java class via reflection
For macros, expand them to understand what code they generate

What the skill does: clj-discover codifies this workflow, ensuring the agent prioritizes idiomatic libraries and systematic exploration before writing integration code.

Behavioral change: From direct interop to research-first integration. The result is cleaner, more maintainable code.
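The second and third steps of that workflow can be sketched with tools that ship with Clojure. The helper name public-method-names and the symbols ready? and launch! are assumptions made for this example. The skill itself may work differently.

```clojure
(require '[clojure.reflect :as reflect])

(defn public-method-names
  "Sorted set of public member names of a Java class (hypothetical helper)."
  [klass]
  (->> (:members (reflect/reflect klass))
       (filter #(contains? (:flags %) :public))
       (map :name)
       (into (sorted-set))))

;; Step 2: inspect a Java class via reflection.
(contains? (public-method-names java.util.UUID) 'randomUUID)
;; => true

;; Step 3: expand a macro to see the code it generates.
(macroexpand-1 '(when ready? (launch!)))
;; => (if ready? (do (launch!)))
```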

3. clj-replace: Format-Aware Structural Replacement

The Problem: Code formatters shift indentation by spaces, breaking text-based str_replace. The agent then wastes tokens failing repeatedly or rewriting entire files.

The Pattern: Clojure is homoiconic—code is data. Two S-expressions are semantically equivalent even if formatted differently. Expert editors handle this automatically via structural editing.

What the skill does: clj-replace compares code by structure (S-expression equivalence) rather than text, ignoring whitespace while preserving the original file's formatting style. It uses the rewrite-clj library to parse, match, and replace nodes safely.

Behavioral change: From brittle text matching to robust structural matching. Formatting variations become irrelevant.
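The idea of structural equivalence can be shown with the core reader alone. This is only a sketch of the principle, not how clj-replace is implemented: read-string discards comments and whitespace, which is the reason a real tool needs a format-preserving parser such as rewrite-clj. The names before-fmt, after-fmt, and same-form? are made up for the example.

```clojure
;; The same function, before and after a formatter changed its layout.
(def before-fmt "(defn area [w h]\n  (* w h))")
(def after-fmt  "(defn area\n  [w h]\n    (*   w h))")

(defn same-form?
  "True when both strings read as the same first form."
  [a b]
  (= (read-string a) (read-string b)))

(= before-fmt after-fmt)
;; => false, so a text-based replace fails

(same-form? before-fmt after-fmt)
;; => true, so a structure-based match succeeds
```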

4. clj-refactor: Mechanism/Policy Separation

The Problem: Without guidance, agents write tangled code where reusable mechanisms are mixed with business policy, creating inflexible designs that accumulate technical debt.

The Pattern: Arne Brasseur's mechanism/policy separation principle is core to building maintainable Clojure systems. Mechanism is context-free, stable, and reusable. Policy is opinionated, domain-specific, and volatile. Expert developers keep these separate.

What the skill does: clj-refactor scans code for opportunities to extract mechanisms from policy—functions where hard-coded values or implicit context can be made explicit, dependencies can be pushed to parameters, and reusable logic can be isolated.

Behavioral change: From monolithic functions to extracted, composable mechanisms. Code becomes easier to test, reuse, and reason about.
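A small before-and-after sketch may make the separation concrete. All names and the threshold value are hypothetical and are not output from the skill.

```clojure
;; Before: the mechanism (filtering by a numeric field) is tangled with
;; policy (which field, which threshold).
(defn vip-customers [customers]
  (filter #(> (:total-spend %) 1000) customers))

;; After: the mechanism is context-free and reusable.
(defn filter-above
  "Keep the maps in coll whose value under k exceeds threshold."
  [k threshold coll]
  (filter #(> (get % k) threshold) coll))

;; The policy is small, explicit, and easy to change.
(def vip-threshold 1000)

(defn vip-customers* [customers]
  (filter-above :total-spend vip-threshold customers))

(def sample [{:name "Ann" :total-spend 1500}
             {:name "Bob" :total-spend 200}])

(= (vip-customers sample) (vip-customers* sample))
;; => true
```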

Note: Unlike clj-debug, clj-discover, and clj-replace—which activate automatically when the agent encounters problems—clj-refactor is user-initiated. You invoke it when you want the agent to analyze code for refactoring opportunities, not in response to a failure.

Why This Matters

These aren't reference manuals or API documentation. They're workflow redirects—rules that teach AI agents to think like expert Clojure developers instead of generic code writers.

The underlying philosophy is simple: A skill's value is measured by behavioral change, not knowledge transfer.

When an agent uses clj-debug, it stops adding logging. When it uses clj-discover, it checks for idiomatic wrappers before raw interop. When it uses clj-replace, formatting becomes irrelevant. When you invoke clj-refactor, the agent identifies tangled mechanisms and suggests extraction. Each skill shifts the agent's default patterns closer to expert practice.

This matters because Clojure is a language of leverage. The REPL, immutability, homoiconicity, and the functional approach all reward practitioners who use them correctly. An agent that doesn't leverage these features isn't just writing slow code—it's missing the point of the language.

The goal is simple: your AI agent shouldn't just write Clojure code—it should think like a Clojure developer. These four skills make that possible.

Find them at: github.com/humorless/clj-native-agent

Agent-Ready Stack

Mon, 11 May 2026 00:00:00 +0000

I keep seeing people share vibe-coded apps built on TypeScript/React + Supabase — seemingly the default recommendation from Lovable or Cursor. As a Clojure programmer, I can't stay quiet about this. In an era where AI agents are deeply embedded in the development workflow, that choice carries structural hidden costs that almost nobody is talking about.

Context Window Is the Bottleneck, and Framework Design Determines Burn Rate

LongCodeBench research shows that Claude 3.5 Sonnet's accuracy on bug-fixing tasks drops from 29% to 3% as context grows from 32K to 256K tokens. Chroma tested 18 frontier models and found the same pattern across all of them.

Coding agents accelerate this degradation: every tool call, every file read, every error message accumulates in the context. A 30-step agent session can consume more than ten times the context of a single conversation turn.

Countless efforts are already underway to manage context from the harness-design side — but the tech stack itself has an enormous impact on context efficiency that rarely gets discussed.

Task-Relevant Subgraph

An AI agent completing a task doesn't need to read the entire codebase — only the files relevant to that task. Call this set the task-relevant subgraph. The size of the subgraph is determined by the architectural design of the framework, not by the model.

The problem with TypeScript + React + Supabase is that a single feature naturally spans multiple layers — component, hook, state, API client, type definition — each living in a different file. The subgraph starts large and only grows as shared dependencies accumulate.

AI tends to recommend the stack it was trained on the most, but "easy to generate" is not the same as "efficient for long-term AI-assisted development." These are two different things.

What Makes a Stack More Agent-Ready

My current go-to is Clojure Stack Lite, and several of its design choices structurally shrink the task-relevant subgraph.

HTMX eliminates implicit client state. React state is scattered across multiple interdependent files; to verify behavior, an agent has to simulate browser interactions. HTMX is driven by server responses, so an agent can verify with a plain curl — the response is an HTML fragment, right or wrong, no ambiguity.

HoneySQL eliminates implicit lazy loading. When an ORM produces an N+1 problem, the debug subgraph includes model definitions, association configs, and migration files, because the issue is buried in implicit behavior. HoneySQL expresses queries as SQL-as-data — no lazy loading, no association magic. N+1 can't happen silently, because the syntax simply doesn't allow it to sneak in. The debug subgraph shrinks from five files to one.
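To show what SQL-as-data looks like, here is a query written as a plain Clojure map in HoneySQL's map format. The table and column names are hypothetical. The sketch stops at the data structure so that it runs without the library. In a real project the map would be handed to HoneySQL's format function to produce the SQL string and parameters.

```clojure
;; The whole query, including the join, is visible in one value.
(def orders-with-customer
  {:select [:o.id :o.total :c.name]
   :from   [[:orders :o]]
   :join   [[:customers :c] [:= :o.customer_id :c.id]]
   :where  [:> :o.total 100]})

;; Because it is data, it can be inspected and extended with core functions.
(:join orders-with-customer)
;; => [[:customers :c] [:= :o.customer_id :c.id]]

(def recent-orders
  (assoc orders-with-customer :order-by [[:o.id :desc]] :limit 10))
```

Every table the query touches appears in the map, so there is no hidden association that could trigger extra queries later.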

Blocking IO eliminates implicit error paths. The fundamental problem with async isn't the syntax — it's that error paths are implicit. Every async call site is a potential break point where an exception can detach from the main flow. To locate a root cause, an agent must trace the entire call chain, and context width grows linearly with chain length. Clojure's blocking IO has no async boundaries; exceptions follow a single path — propagate upward, handled uniformly in middleware. When debugging, an agent only needs two places: the middleware log and the call site the log points to. Context scope stays fixed regardless of system size.
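The single error path can be sketched as a Ring-style middleware written in plain Clojure. The handler name load-order, the response shape, and the log line are assumptions for illustration, not code from Clojure Stack Lite.

```clojure
(defn wrap-exceptions
  "Catch anything thrown below this point, log it once, and return a 500."
  [handler]
  (fn [request]
    (try
      (handler request)
      (catch Exception e
        ;; The one place an agent needs to look: the log names the failure.
        (println "unhandled:" (ex-message e) (ex-data e))
        {:status 500 :body "Internal error"}))))

;; A blocking handler: the exception simply propagates upward.
(defn load-order [request]
  (throw (ex-info "order not found" {:id (:id request)})))

(def app (wrap-exceptions load-order))

(app {:id 7})
;; => {:status 500, :body "Internal error"}
```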

Explicit Over Implicit Is Not Just a Clojure Virtue

All three points share a common structure: the less implicit behavior, the smaller the context an agent needs to bring in.

The point here isn't a framework or language comparison — it's an observation about design philosophy. Explicit over implicit is a virtue for human developers; for AI agents, it's a structural guarantee that they won't go dumb prematurely.

Design principles the Clojure community has championed for years happen to be a competitive advantage in the AI agent era. I've chosen to frame this in terms of context efficiency, hoping it helps more people appreciate what the Clojure community figured out a long time ago.

Teaching Clojure programming class

Tue, 05 May 2020 00:00:00 +0000

When I told others that I was a Clojure programmer, they responded apathetically. Why have so many people in Taiwan never heard of this great programming language? One day, an idea occurred to me: how about teaching some students?

The advertisement

I rewrote my advertisement again and again. What kind of value proposition would my prospects appreciate? I actually did not know. In the end, I wrote three objectives in my advertisement.

Help you learn the Clojure programming language
Help you become the real senior programmer in the eyes of your colleagues.
Help you become more confident whenever you want to ask for a raise.

My friends did not believe I could get students, and they tried to tell the uncomfortable truth mildly. They asked something like "Who is your target audience?"

Fortunately, I got two students just after I posted it. Two of my college classmates wanted to learn Clojure programming language from me.

The ways of teaching

At the very beginning, I asked my students what they wanted to learn. This was a one-on-one tutoring class, so it could be customized. I organized the class into 4 stages.

Development environment setup.
The productivity brought by Clojure.
Practicing at the 4clojure website.
Studying any specific topics they were interested in. (Customization part)

Lessons learned from teaching

Through the questions from my students I found out some obstacles in learning Clojure.

Switching between purpose view and implementation view

When I write recursion, I split it into three steps:

Choose the name of this recursion function. The name is about this function's purpose.
Think about the boundary condition of this function. When will it stop?
Write the implementation of this function. This function should be implemented by a call to itself with a different input argument and some connection code.

(defn range-to-zero [x] 
  (when (> x 0)
    (conj (range-to-zero (dec x)) x)))

Take the above code as an example. I trust that the output of (range-to-zero 4) is '(4 3 2 1). To define (range-to-zero 5), I just need to conj a 5 onto '(4 3 2 1).

My students did not think this way: they simulated the execution of the code from the very top down to the boundary conditions. They organized their minds like an interpreter and traced the code just as the interpreter would. I told my students that they needed to switch their thinking between the purpose view and the implementation view.
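The purpose view can be checked directly at the REPL. The sketch below repeats the function from above and confirms that the answer for 5 is just the answer for 4 with one more element, without tracing every call.

```clojure
(defn range-to-zero [x]
  (when (> x 0)
    (conj (range-to-zero (dec x)) x)))

;; Purpose view: trust the smaller answer, then add one element.
(range-to-zero 4)
;; => (4 3 2 1)

(conj (range-to-zero 4) 5)
;; => (5 4 3 2 1)

(= (range-to-zero 5) (conj (range-to-zero 4) 5))
;; => true

;; Boundary condition: the recursion stops at zero.
(range-to-zero 0)
;; => nil
```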

Different levels of complexity

After my students had solved about 50 questions on 4clojure, they felt confident enough to fly alone. However, when they had solved only about 25 questions, they were quite confused, especially about the idiomatic ways to solve 4clojure questions.

I considered there were 5 levels of complexity:

Solve the question by remembering the Clojure function name. For example, frequencies.
Solve the question by using some sequence functions: map, filter, mapcat.
Solve the question by using reduce.
Solve the question by using recursion.
Solve the question by using mutual recursion.

I encouraged my students to solve every question at the lowest level of complexity possible. There were still certain special cases, like flatten, which might not fit into my categories.
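To make the levels concrete, here is one small problem, counting how often each element occurs, solved at three of them. The helper names freq-reduce and freq-rec are made up for this sketch.

```clojure
;; Level 1: a core function already does the job.
(frequencies [:a :b :a])
;; => {:a 2, :b 1}

;; Level 3: the same result with reduce.
(defn freq-reduce [coll]
  (reduce (fn [acc x] (update acc x (fnil inc 0))) {} coll))

;; Level 4: the same result with explicit recursion.
(defn freq-rec [coll]
  (if (empty? coll)
    {}
    (update (freq-rec (rest coll)) (first coll) (fnil inc 0))))

(= (frequencies [:a :b :a])
   (freq-reduce [:a :b :a])
   (freq-rec [:a :b :a]))
;; => true
```

All three are correct, but the first is the easiest to read, which is why the lowest workable level is the one to prefer.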

Final notes

One of my students told me that he decided to learn the Clojure programming language because of Robert C. Martin's recommendation. Thanks to Uncle Bob for doing such great marketing for me.

Using Datomic with disk cache and LU cache

Sun, 15 Sep 2019 00:00:00 +0000

The background of this post

I began to use Datomic seriously in my project at work in February 2019. I encountered certain performance issues and solved them with a disk cache and an LU cache.

Analytical queries need pre-computation

My project had several analytical queries that implemented the business rules. With Datomic's expressive query language, it was easy to write queries that stayed close to the domain model. That was great for expressiveness, but the queries were slow, so I needed to pre-compute some results.

How should I save the query results? They were EDN data. Should I set up a separate key-value database to cache them, or just use Datomic itself as the key-value store?

Using Datomic as key-value store

In my use cases, I used Datomic as key-value store with the following schema.

| schema name | :booking/tx | :booking/team | :booking/bytes |
|-------------|-------------|---------------|----------------|
| data type   | long        | string        | bytes          |

The :booking/tx and :booking/team attributes served as the key, and :booking/bytes served as the value. To store an EDN value in :booking/bytes, I first needed a handy library: Nippy. Nippy serializes Clojure's composite data structures into plain Java byte arrays.

That raised another question: is there a size limit on the Datomic value type :db.type/bytes? I spent some time looking for the answer in the Datomic Google group.

I believe the rule of thumb is that values stored in Datomic — strings, bytes, etc. — should not exceed one kilobyte. Nothing will break if they do, but Datomic's storage layout is optimized for values this size or smaller.

Great! Nothing will break if they do.

OutOfMemory Error occurred

Some of my queries used not-join clauses. At first, not-join looked great because it let me express my intent without any low-level interpretation. Soon, however, the queries with not-join threw OutOfMemory errors, so I decided to do some optimization.

Extract not-join out of query and use LU-cached memoize

The original query was like this:

(d/q '[:find (count ?r) .
       :where [?r :release/name "Live at Carnegie Hall"]
              (not-join [?r]
                [?r :release/artists ?a]
                [?a :artist/name "Bill Withers"])]
     db)

The modified equivalent queries were:

(def B (into #{}
             (d/q '[:find ?r
                    :where [?r :release/artists ?a]
                           [?a :artist/name "Bill Withers"]]
                  db)))

(d/q '[:find (count ?r) .
       :in $a $b
       :where
       [$a ?r :release/name "Live at Carnegie Hall"]
       ($b not [?r])]
     db B)

After I extracted the not-join part from the original query, I noticed that some of the extracted queries were called with the same inputs by several outer queries. Memoizing them would save computation.

However, the standard clojure.core/memoize never evicts entries, so its cache grows without bound and leaks memory. I chose the Clojure contrib library core.memoize to cache the query results with an LU (least-used) cache.

The query function to be memoized looked like this:

(defn memo-q*
  ;; t is not used in the body; it only takes part in the cache key
  [db t]
  (into #{}
        (d/q '[:find ?r
               :where [?r :release/artists ?a]
                      [?a :artist/name "Bill Withers"]]
             db)))

(def memo-q (clojure.core.memoize/lu memo-q* {} :lu/threshold 2))

The memoized query function was called like this:

(memo-q db (d/basis-t db))

The Datomic t of the most recent transaction reachable via the db value serves as part of the cache key. Once a new transaction lands, basis-t changes, so the memoized function recomputes instead of returning an out-of-date result.
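The role of t as a cache key can be sketched without Datomic or core.memoize. The example below swaps in clojure.core/memoize and a plain vector standing in for the db value, purely to show the invalidation idea. The names calls, expensive-query, and memo-query are hypothetical, and the eviction behaviour of the LU cache is not modelled.

```clojure
;; Counts how many times the underlying query really runs.
(def calls (atom 0))

(defn expensive-query
  "Stand-in for a slow query; t is unused except as part of the cache key."
  [db-snapshot t]
  (swap! calls inc)
  (count db-snapshot))

(def memo-query (memoize expensive-query))

(memo-query [1 2 3] 10)    ; computed
(memo-query [1 2 3] 10)    ; same snapshot and t: served from the cache
(memo-query [1 2 3 4] 11)  ; new transaction, new t: computed again

@calls
;; => 2
```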

A Clojurian's idioms and patterns for ETL

Mon, 01 Jul 2019 00:00:00 +0000

Background

I needed to build eight Excel ETLs in my project. At first, I implemented some of them without any design. I did not even implement schema validation, and I soon felt the pain. After several rewrites, I abstracted out some idioms and patterns for ETL.

Problems

We need to import data from several Excel files into a Datomic database. The ETL (extract-transform-load) raises several concerns:

Schema validation: Can we have a validation function into which we only inject the schema, and which then handles all the rest for us?
Transformation complexity: The transformation from Excel to a Datomic table varies a lot. The simplest one just copies data, but the complex ones need to look up tables in the database. How can we organize the different types of transformation functions so that they are more reusable and composable?
Database upsert semantics: The identity key of the database table may be a compound of several fields, or the table may have cardinality-many fields. That is to say, the basic upsert semantics offered by Datomic are not enough.

Solution for schema validation

The clojure.spec library is great for schema validation.

;; Both namespaces are assumed to (require '[clojure.spec.alpha :as spec])
;; library functions defined at utility namespace
(defn check-raw-fn
  "assemble schema and then create a validation fn"
  [schema]
  (fn check-raw
    [data]
    (if (spec/valid? schema data)
      data
      (let [desc (spec/explain-str schema data)]
        (throw (ex-info desc {:causes data :desc desc}))))))

;; Example application functions
(spec/def ::apply-time inst?)
(spec/def ::customer-id string?)
(spec/def ::lamp-customer-id string?)
(spec/def ::sales-name string?)
(spec/def ::source #{"agp" "lap"})

(spec/def ::mapping
  (spec/* 
    (spec/keys :req-un
               [::apply-time ::customer-id ::lamp-customer-id ::sales-name ::source])))

(def ^:private check-raw
  (utility/check-raw-fn ::mapping))

In this design:

Even though I do not know how many rows an Excel file may have, I can still use (spec/* ...) to represent the schema for the whole file. If spec did not offer something like (spec/* ...), I would have to write loop logic in the check-raw-fn function, which would introduce context dependency.
The spec names are exactly the same as the column names in Excel. Keeping it simple makes the program more robust.
If a string has only a few possible options, represent it as a set: #{option1 option2 ...}
When throwing an exception, I use (ex-info ...) and put the output of (spec/explain-str ...) into it. Then I can find out what is wrong just by reading the exception message.
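The behaviour described in these points can be tried out with a cut-down version of the validator. The two-column spec and the sample rows are hypothetical, and check-rows is a simplified stand-in for the check-raw function above rather than the project's actual code.

```clojure
(require '[clojure.spec.alpha :as spec])

;; A cut-down, hypothetical two-column schema.
(spec/def ::source #{"agp" "lap"})
(spec/def ::sales-name string?)
(spec/def ::rows
  (spec/* (spec/keys :req-un [::source ::sales-name])))

(defn check-rows
  "Return data unchanged when valid; otherwise throw with the explanation."
  [data]
  (if (spec/valid? ::rows data)
    data
    (let [desc (spec/explain-str ::rows data)]
      (throw (ex-info desc {:causes data :desc desc})))))

;; Any number of valid rows passes through untouched.
(check-rows [{:source "agp" :sales-name "Ann"}
             {:source "lap" :sales-name "Bob"}])

;; An unknown source is rejected, and the offending data travels in ex-data.
(try
  (check-rows [{:source "xyz" :sales-name "Ann"}])
  (catch clojure.lang.ExceptionInfo e
    (:causes (ex-data e))))
;; => [{:source "xyz", :sales-name "Ann"}]
```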

Also, at the trigger API of ETL, the web API deliberately catches only certain types of Exception:

(try
  (if-let [r (etl/sync-data cmd filename)]
    (ok {:result :insert-done})
    (ok {:result :already-sync}))
  (catch clojure.lang.ExceptionInfo e
    (bad-request {:reason (ex-data e)}))
  (catch java.util.concurrent.ExecutionException e
    (bad-request {:reason (.getCause e)})))

The clojure.lang.ExceptionInfo clause is there to catch the schema validation errors thrown by my application code. The java.util.concurrent.ExecutionException clause catches errors from Datomic transactions. Other exceptions are less likely, so I let them propagate and be recorded in the log file.

Solution for transformation complexity — let over map merge

I propose a pattern, which I call let over map merge, to handle the transformation complexity.

Consider a transformation function data->txes whose input and output are both sequences of maps:

Each map in the input data represents a row in the Excel file.
Each map in the output txes represents a row in the Datomic table.

(defn- data->txes
  "data is a sequence of {HashMap}"
  [data]
  (let [db (d/db conn)
        table (utility/tax-id->c-eid db)]
    (map #(transformation-f table %) data)))

We can easily divide the transformation into two categories:

Basic transformation: Just copy the field, or with pure function transformation.
Complex transformation: When transforming the input data, we need to also look up the database content.

If we pull basic-mapping and complex-mapping out of transformation-f, we can change the original code into:

(defn- data->txes
  [data]
  (let [db (d/db conn)
        table (utility/tax-id->c-eid db)]
    (let [basic-tx (map basic-mapping data)
          complex-tx (map #(complex-mapping table %) data)]
      (map merge basic-tx complex-tx))))

With this let over map-merge pattern, we can make the granularity of the transformation function smaller so as to make them more reusable and composable. In certain cases, basic-mapping only needs to change the key-name in the hash map, so we can use clojure.set/rename-keys to implement the basic-mapping.
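Here is a self-contained sketch of the pattern with the database replaced by an in-memory lookup table. The column names, the :order attributes, and the table contents are all hypothetical. Only the shape of data->txes follows the code above.

```clojure
(require '[clojure.set :as set])

;; Hypothetical lookup table that would normally be built from the database.
(def tax-id->eid {"123" 1001, "456" 1002})

(defn basic-mapping
  "Pure renaming of Excel columns to attribute names."
  [row]
  (-> row
      (select-keys [:name :amount])
      (set/rename-keys {:name :order/name, :amount :order/amount})))

(defn complex-mapping
  "Mapping that needs looked-up database content."
  [table row]
  {:order/customer (get table (:tax-id row))})

(defn data->txes [table data]
  (let [basic-tx   (map basic-mapping data)
        complex-tx (map #(complex-mapping table %) data)]
    (map merge basic-tx complex-tx)))

(data->txes tax-id->eid [{:name "A" :amount 10 :tax-id "123"}])
;; => ({:order/name "A", :order/amount 10, :order/customer 1001})
```

Each mapping function can be tested on its own, and another kind of mapping is added by binding one more sequence in the let and passing it to map merge.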

Solution for database upsert semantic

In Datomic, we can use :db.unique/identity to make an attribute work like a primary key in a traditional RDBMS.

Compound primary key

Consider a table whose compound primary key is stream-unique-id, writing-time, and source. How do we upsert when we have txes like the ones below?

 [#:rev-stream{:stream-unique-id "AA"
               :writing-time #inst "2019-04-01T02:39:00.000-00:00"
               :source :etl.source/agp
               :campaign-name "BB"}]

With a db transaction function upsert-rev-stream, we can simply write txes as

 [[:fn/upsert-rev-stream
   #:rev-stream{:stream-unique-id "AA"
                :writing-time #inst "2019-04-01T02:39:00.000-00:00"
                :source :etl.source/agp
                :campaign-name "BB"}]]

The transaction function :fn/upsert-rev-stream handles the upsert complexity.

 {:db/id #db/id [:db.part/user]
  :db/ident :fn/upsert-rev-stream
  :db/doc "The primary key of rev-stream is compound key"
  :db/fn #db/fn
  {:lang :clojure
   :params [db m]
   :code (if-let [id (ffirst
                      (d/q '[:find ?e
                             :in $ ?u ?t ?s
                             :where
                             [?e :rev-stream/stream-unique-id ?u]
                             [?e :rev-stream/writing-time ?t]
                             [?e :rev-stream/source ?s]]
                           db (:rev-stream/stream-unique-id m)
                           (:rev-stream/writing-time m)
                           (:rev-stream/source m)))]
           [(-> (dissoc m :rev-stream/stream-unique-id
                        :rev-stream/writing-time
                        :rev-stream/source)
                (assoc :db/id id))]
           [m])}}

Cardinality many

Consider a table with a cardinality-many attribute :order/accounting-data, and with :order/product-unique-id marked :db.unique/identity. How do we upsert when we have txes like the ones below?

[#:order{:io-writing-time #inst "2019-04-01T02:39:00.000-00:00",
         :service-category-enum :product.type/today,
         :accounting-data
         [#:accounting{:month "2019-04", :revenue -2}
          #:accounting{:month "2019-05", :revenue -3}
          #:accounting{:month "2019-02", :revenue 4}
          #:accounting{:month "2019-01", :revenue 5}]}]

With a db transaction function upsert-order, we can simply write txes as

  [[:fn/upsert-order
    #:order{:io-writing-time #inst "2019-04-01T02:39:00.000-00:00",
            :service-category-enum :product.type/today,
            :accounting-data
            [#:accounting{:month "2019-04", :revenue -2}
             #:accounting{:month "2019-05", :revenue -3}
             #:accounting{:month "2019-02", :revenue 4}
             #:accounting{:month "2019-01", :revenue 5}]}]]

The transaction function :fn/upsert-order handles the upsert complexity.

 {:db/id #db/id [:db.part/user]
  :db/ident :fn/upsert-order
  :db/doc "The :order/accounting-data is cardinality many.
  When insert semantic, transact `[m]`
  When update semantic, do retraction of :order/accounting-data first and then transact `m`  "
  :db/fn #db/fn
  {:lang :clojure
   :params [db m]
   :code (if-let [eid (ffirst
                       (d/q '[:find ?e
                              :in $ ?u
                              :where
                              [?e :order/product-unique-id ?u]]
                            db (:order/product-unique-id m)))]
           (let [ad-refs (d/q '[:find [?a ...]
                                :in $ ?e
                                :where [?e :order/accounting-data ?a]]
                              db eid)
                 retracts (mapcat (fn [r]
                                    [[:db/retractEntity r]
                                     [:db/retract eid :order/accounting-data r]])
                                  ad-refs)]
             (conj (vec retracts) m))
           [m])}}

Conclusions

By abstracting out these ETL idioms and patterns, I came to understand that context dependency is the primary cause of complex application code. Both Datomic transaction functions and the regex operators of clojure.spec help remove context dependency from application code. Use them wisely!

Lessons learned from the software consulting job

Sun, 23 Jun 2019 00:00:00 +0000

I live in Taiwan and I cannot find Clojure jobs here. Although the first legal gay wedding in Asia took place here, it seems that real programming language innovation still needs some evangelists to spread it. Therefore, I decided to create a Clojure job myself. In January this year, I had a chance to develop enterprise software for a big company, and I chose Clojure as my primary technical stack.

Technical stack issues

When I discussed this enterprise software solution with my clients, we focused on the problem domain. However, when I told them that I wanted to use Clojure, Datomic, and ClojureScript, they said no. They offered a lot of cliches: they had never heard of Clojure before, and it would be difficult to find Clojure programmers. So I made some compromises: I would use React with JavaScript in the frontend, but Clojure in the backend with Datomic as the database. To justify Clojure and Datomic, I explained that the business requirements included temporal queries, which are a piece of cake for Datomic but very time-consuming for traditional relational databases.

After developing this project for a while, I regretted that I had not insisted on ClojureScript. I spent a lot of time on JavaScript boilerplate code, and that time did not bring any value to my clients.

A very simple user login is good enough for a small group of users

The enterprise software solution needed to be on-premise, installed on the private network at the company offices. About 30 users would log in every day. At the beginning, I thought of three different ways to solve the user login problem:

Single sign-on through other enterprise software that already existed
Leverage a third-party authorization service
Traditional user login backend APIs and frontend UI with login/register/user management functions like resetting password.

Option 2 might be fast enough, but my clients did not like third party service.

My final proposal was a login module like this:

Frontend UI provided the login and password modification functions to ordinary users.
The administrator of this system used ETL (extract-transform-load) to manage user accounts. Given this design, we did not need any user registration or user accounts management UI.

Revenue spreading problem

This enterprise software had a business requirement that I called the revenue spreading problem.

Revenue spreading problem:

For every order, there is a start date and end date of this order. The total days of an order are (end date - start date + 1)
For every order, there is a net revenue of this order.
For every order, we need to calculate the monthly revenue. The definition of monthly revenue is net revenue * the revenue days of certain month / total days

For example, if an order starts on 5/5 and ends on 6/8 with a net revenue of 35 dollars, the order has 27 + 8 = 35 total days. The monthly revenue is then 27 dollars for May and 8 dollars for June.

To solve this, I first used first-day-of-the-month and last-day-of-the-month from the clj-time library to calculate how many days of an order fell within each month. That first version was a traditional imperative solution. I quickly found that I could do better with functional thinking.

My improved version:

Generate a sequence of dates using periodic-seq in clj-time. The period is one day, and the sequence runs from the order's start date to its end date.
Apply group-by to the day sequence from step 1, with a grouping function that returns the year-month string of a date. For example, the date 2019/05/01 returns "2019-05".
Count the days in each group from step 2.
Spread the revenue using the counts from step 3.
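The four steps can be sketched as follows. To keep the example free of dependencies it uses java.time instead of clj-time, so it is an equivalent illustration rather than the project's code. The function names days-of-order and monthly-revenue are made up.

```clojure
(import '(java.time LocalDate YearMonth))

(defn days-of-order
  "Step 1: every date from start to end, inclusive."
  [^LocalDate start ^LocalDate end]
  (->> (iterate #(.plusDays ^LocalDate % 1) start)
       (take-while #(not (.isAfter ^LocalDate % end)))))

(defn monthly-revenue
  "Steps 2 to 4: group the days by year-month, count them, spread the revenue."
  [start end net-revenue]
  (let [days  (days-of-order start end)
        total (count days)]
    (->> days
         (group-by #(str (YearMonth/from %)))
         (into (sorted-map)
               (map (fn [[year-month ds]]
                      [year-month (/ (* net-revenue (count ds)) total)]))))))

;; The order from the example above: 5/5 to 6/8 with 35 dollars of revenue.
(monthly-revenue (LocalDate/of 2019 5 5) (LocalDate/of 2019 6 8) 35)
;; => {"2019-05" 27, "2019-06" 8}
```

Because the days are enumerated rather than computed with month arithmetic, orders that span several months or a year boundary need no special cases.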

CI/CD issues

I was not an expert of DevOps. When I needed to deploy the project, I took some time to study ansible because the great book Deploying Your First Clojure App ...From the Shadows shows introduced ansible. I still felt ansible is a great tool worth learning, however, the target servers were under the bastion host.

Engineers in the same company told me that they installed a Drone CI/CD server in the virtual private network behind the bastion host. As a Clojure developer, I decided to use LambdaCD. Actually, it was even simpler than Drone. Parentheses abundant lisp clj files were more expressive than yaml files.

When I ran into problems, I asked questions at the LambdaCD github repo. Within two days, the author of LambdaCD kindly replied to my questions. I think LambdaCD is worth recommending, both for the quality of the software and for the quick responses.

Evangelism of Clojure

Since I was doing software consulting at a big company, I could apply to give a technical talk inside the company. Taking the chance, I introduced Clojure to about 10 developers. Those who already had experience with Scala showed more interest than the others. A good beginning anyway, I thought. Here are the slides from the technical talk.

Using datomic with Luminus: Where to put our queries?

Wed, 12 Jun 2019 00:00:00 +0000

If we build a Luminus project with a db option other than datomic, for example +postgres, the code arrangement is much more straightforward. Open the file resources/sql/queries.sql and put the sql queries and sql transaction commands in it. Then we can just require the xxx.db.core namespace, and the db queries and commands are all available.

Where to put the db queries if we use db option as +datomic?

Putting the datomic queries in xxx.db.core, the same file as the connection state, was my first attempt. However, datomic queries actually execute in the application's runtime, not in the db server's runtime. Also, if we design each query function to accept a datomic db value as an input argument, then our query functions become pure functions.

After realizing that our query functions are pure functions, I decided to arrange my application namespaces like this:

prj.[service].assembly ---> prj.db.core
                            ;; assembly only refers to the conn var from prj.db.core
                       ---> datomic.api
                       ---> prj.db.query
                             ;; All the query functions are pure functions and live here.

The namespace [service].assembly wires together utility functions (pure functions) and stateful things such as the datomic connection.

Where to put the db transactions?

Given that [service].assembly refers to conn, I decided to call (d/transact conn ... ) in this namespace. However, I still need some transformation to produce transaction data that can be passed directly to d/transact. Therefore, the arrangement is:

prj.[service].assembly ---> prj.db.command

In prj.db.command, I put the transformation functions that create datomic transaction data. These transformation functions are also pure functions.
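
A minimal sketch of what such a function in prj.db.command could look like. The function name, the input map shape, and the :user/* attributes are hypothetical and chosen only for illustration. The point is that the function takes plain data and returns plain transaction data without ever touching the connection.

```clojure
;; Hypothetical contents of prj.db.command: pure functions that build tx data.
(defn user->tx-data
  "Turn a plain map into Datomic transaction data.
   The :user/* attribute names are illustrative assumptions."
  [{:keys [email nickname]}]
  [{:db/id         "new-user"   ; string tempid, resolved when transacted
    :user/email    email
    :user/nickname nickname}])

;; In prj.[service].assembly, the stateful call would then look like:
;; (d/transact conn (user->tx-data {:email "ann@example.com" :nickname "ann"}))
```

Because the function is pure, it can be unit tested by comparing its return value with the expected data, with no database running.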

Conclusion

Compared to a traditional sql db option, the reasonable place to put database queries with the datomic db option is totally different.

With traditional sql db options:

We write HugSQL sql source files with sql and tags.
We need integration tests to test these queries.
We place our queries in resources/sql/queries.sql

With the datomic db option:

We write Clojure source files with data.
We only need unit tests to test these queries.
We place our queries in prj.db.query namespace.

Clojure development environment by Vagrant

Mon, 13 May 2019 00:00:00 +0000

If you want a portable Clojure development environment and you use Vagrant and vim-fireplace, you may want to try my Vagrantfile.

git clone https://github.com/humorless/dotfiles
cd dotfiles
vagrant up

A part of the Vagrantfile you may need to remove

if Vagrant.has_plugin?("vagrant-timezone")
  config.timezone.value = "Asia/Taipei"
end

The beginning of this repo

Several years ago, I created a github repo called dotfiles to keep my vimrc file. Later, every time I changed jobs, I modified my set of favorite vim plugins. I have changed my vim plugin collection many times. Sometimes I installed a cool vim plugin but, after a while, totally forgot how to use it. There are not many vim plugins in this dotfiles repo, because I am not a vim l33t hax0r.

development and deployment

I once had a job where I needed to work in the AWS cloud9 environment. Some of my jobs required me to install a totally new development environment. Recently, I needed to deploy a Clojure environment on a production system, so I learned a little ansible and used it to install java8.

One day, I found that vagrant can use ansible for provisioning, so I combined them.

Some nice tools I cannot live without

nvm is important to me because I often need to switch node versions. autojump is also important.

Using Datomic in my app

Sat, 27 Apr 2019 00:00:00 +0000

The background of this post

I began to use Datomic seriously in my project at work in February 2019. Now it is time to write down some of that experience. When I started, I found a lot of documents about how to use Datomic. However, I still found certain points from my project worth mentioning.

Query API and Pull API are enough

When I had just begun to write Datomic code, I soon found a post from Val. In the post, Val used the Entity API.

In my project, I used only the Query API and the Pull API. The Query API was mostly for retrieving entity ids, and the Pull API was for pulling out the necessary fields or sometimes doing a 'join'. I think the article SEPARATION OF CONCERNS IN DATOMIC QUERY: DATALOG QUERY AND PULL EXPRESSIONS explains a similar idea. The Entity API is also good, but the Pull API is even better.

Occasionally, a generalized CAS (compare-and-swap) is needed, or you need to use a stamp

In my project, I need to use Datomic to model the following:

A user can propose a request. Initially, the request is in the open status.
An admin can approve, reject, or modify the user's request.

The request schema is like:

:req/status     ;; cardinality one. It can be - open, modified, approved, rejected
:req/things     ;; cardinality many. [thing-id ...]

The admin sees the user requests in a web application UI. There are three options for the admin: approve, reject, and modify. If a request is approved or rejected, it is no longer alive and disappears from the admin UI. However, if a request is modified, it can still be approved, rejected, or modified again. When a request is modified, only req/things can change. Multiple admins may be operating on the same request at the same time.

The state diagram of request status is:

 open -> modified 
 modified -> modified 
 {modified, open} -> approved (done)
 {modified, open} -> rejected (done)
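
The state diagram can also be written down as plain data. The sketch below is illustrative only: the names transitions and allowed? are hypothetical and do not come from the project, and representing statuses as bare keywords is an assumption.

```clojure
(def transitions
  "Request status -> set of statuses it may move to."
  {:open     #{:modified :approved :rejected}
   :modified #{:modified :approved :rejected}
   :approved #{}    ; done
   :rejected #{}})  ; done

(defn allowed?
  "True when a request in status `from` may move to status `to`."
  [from to]
  (contains? (get transitions from #{}) to))

(allowed? :open :approved)     ;; => true
(allowed? :approved :modified) ;; => false
```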

Consider a situation: two admins, A and B, work on the same request without being aware of each other, and they push their buttons at the same time. Admin A approves the request and admin B modifies it. The request had already been modified once, so it is in the modified status when the two admins process it.

There are two acceptable outcomes: either the operation of admin A succeeds or the operation of admin B succeeds. If A's operation succeeds first, the request cannot be modified anymore. If B's operation succeeds first, A's approval should not happen, because req/things has already changed and admin A approved a different set of req/things.

I considered using db.fn/cas to guarantee that only one of the two operations can succeed. However, db.fn/cas does not work on attributes with cardinality many.

I think there are two ways to solve this problem of mutually exclusive concurrent operations:

Add an extra attribute req/stamp to the request. The stamp is initially 0, and every operation increases it by 1. Then I can use this stamp with db.fn/cas to enforce the logical strictness of the operations.
Install a custom db function that can do CAS on a cardinality-many attribute to enforce the same strictness.
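
A minimal sketch of the first option, assuming a cardinality-one attribute :req/stamp has been added to the schema. The function names and the keyword status values are hypothetical. Each operation carries the stamp value it last read and includes a :db.fn/cas on it, so if another admin's transaction commits first, the stamp no longer matches and the whole transaction is rejected.

```clojure
(defn approve-tx
  "Tx data for approving request `req-eid`, valid only while the stamp
   is still `seen-stamp`."
  [req-eid seen-stamp]
  [[:db.fn/cas req-eid :req/stamp seen-stamp (inc seen-stamp)]
   [:db/add req-eid :req/status :approved]])   ; status value is illustrative

(defn modify-tx
  "Tx data for changing the things of request `req-eid` from `old-things`
   to `new-things` (collections of thing ids), guarded by the same stamp."
  [req-eid seen-stamp old-things new-things]
  (concat
   [[:db.fn/cas req-eid :req/stamp seen-stamp (inc seen-stamp)]
    [:db/add req-eid :req/status :modified]]
   (for [t old-things :when (not (contains? (set new-things) t))]
     [:db/retract req-eid :req/things t])
   (for [t new-things :when (not (contains? (set old-things) t))]
     [:db/add req-eid :req/things t])))
```

If admins A and B both read stamp 3, whichever transaction is processed first moves the stamp to 4, and the other one fails its compare-and-swap. Both functions only build data, and the result would be passed to d/transact elsewhere.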

DB Enumeration

I use :db/ident to do enumerations in my project:

[:db/add #db/id [:db.part/user] :db/ident :product.type/account]
[:db/add #db/id [:db.part/user] :db/ident :product.type/display]

These enumerations represent the different products. This modeling raises a few related issues.

How to pull out all the enumerations of the same type?

I deliberately give enumerations of the same type the same namespace, so I need a query that filters by namespace. Conveniently, we can call Clojure functions directly in a Datomic query.

(defn product-enum-eids
  "all the product enumeration eids"
  [db]
  (d/q '[:find [?e ...]
         :in $ ?nsp
         :where [?e :db/ident ?attr]
         [(namespace ?attr) ?nsp]]     ;; function expression: keeps only idents whose namespace equals ?nsp
       db "product.type"))

How to store the external string and enumeration mapping in Datomic?

Once again, I use a simple schema with no magic.

   {:db/doc "External name associated with a db enumeration value"
    :db/ident :enum/name
    :db/valueType :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique :db.unique/identity
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

   {:db/doc "db enumeration value"
    :db/ident :enum/value
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/one
    :db/unique :db.unique/identity
    :db/id #db/id [:db.part/db]
    :db.install/_attribute :db.part/db}

When we need to import data from files and map external names to DB enumeration values, we can pull out the whole mapping at once.

(defn name2enum-table
  "create a mapping table that can lookup enumeration from string name."
  [db]
  (into {}  (d/q '[:find ?k ?enum
                   :where
                   [?e :enum/name ?k]
                   [?e :enum/value ?v]
                   [?v :db/ident ?enum]]
                 db)))

REPL tips

Sat, 30 Mar 2019 00:00:00 +0000

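Starting this February, I took on a project to develop an internal application for a company, and I implemented it with clojure + luminus + datomic. Before I knew it, I had been working in the Clojure REPL every day for nearly two months. After playing with the REPL daily, I quickly discovered some blind spots in how I had been using it.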

Not making good use of `clojure.pprint/pprint`

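The main reason was that in my fireplace.vim environment, before I added any special configuration, REPL operations such as cpp and cqp did not produce pretty-printed output. Eventually I made up my mind, set up my leiningen profiles properly, and added a leiningen dependency called vinyasa.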

Once that is configured, (>pprint ...) can be used to pretty print.

Not making good use of `*1` and `*2`

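In the past, my REPL sessions often went like this:

(f1 a b c) => try until the result is correct

(f2 (f1 a b c) d) => again, try until the result is correct

(f3 (f2 (f1 a b c) d) e) => and the commands grow longer and longer, and harder and harder to type

There is no need for this trouble. The second command can be written as (f2 *1 d), because *1 holds the value of the most recent REPL result.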

dependency injection with Clojure

Wed, 12 Jul 2017 00:00:00 +0000

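When writing Clojure, REPL-driven development already lets most functions be tested quickly. Still, as a project grows, unit tests written in the standard way become necessary. According to one informal statistic, about 90% of the functions in a typical Ruby on Rails project have side effects, whereas in Clojure projects often only about 40% do.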

Even in Clojure we still run into functions with side effects. What is a good way to handle them?

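After searching stackoverflow, I quickly found a very handy function, with-redefs. The gist of the stackoverflow answer: because Clojure has dynamic binding of vars, with-redefs (which temporarily replaces a var's root value) can achieve the same semantics as dependency injection.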

I tried it and it really works. Here is an example:

(deftest platform-contact-test
  (testing "platform-contact"
    ; use the DI technique to test the function platform-contact
    (is (= 170
           (with-redefs [get-platform-contact (fn [_] (slurp "./resources/contact_data.txt"))]
             (count (platform-contact (temp-platform-all))))))))

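In this example, the original get-platform-contact is a function with side effects, and it is called by platform-contact. get-platform-contact sends an http request and returns data from a remote server, so without a replacement the unit test would be very slow. With with-redefs, get-platform-contact is easily swapped for a function that returns data from a fixed file, and the unit test runs quickly.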

About this advanced feature of Clojure, there is a comment on stackoverflow: Needing a framework for DI is really just compensating for a lack of sufficient features in the language itself.

groupby

Sun, 21 May 2017 00:00:00 +0000

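It started when I was working through the 4clojure exercises and hit a problem that asks you to reimplement Clojure's group-by function. I struggled for quite a while and looked up a lot of material before I barely managed to write it with reduce. Recently, however, I ended up using groupby at work.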

(fn f [k coll]
  (reduce
    (fn [c v]
      (update-in c [(k v)] (fnil conj []) v))
    {} coll))

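The problem at work was refactoring code written by a colleague. What the code did: "take the JSON output of a database dump, run a very complicated two-level loop that swaps the primary key of the original data, and then store the data into a mysql database." The JSON dumped from the database looks roughly like this: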

{
  "result": [
    {
      "platform": "c01.i01",
      "ip_list": [
        {
          "ip": "192.168.0.1",
          "hostname": "ggyy6699"
        },
        {
          "ip": "192.169.1.1",
          "hostname": "ggyy7700"
        }
      ]
    },
    {
      "platform": "c01.i05",
      "ip_list": [
        {
          "ip": "192.168.0.2",
          "hostname": "ggkk8899"
        },
        {
          "ip": "192.169.1.2",
          "hostname": "ggkk9900"
        }
      ]
    }
  ]
}

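Judging from this JSON, platform is the primary key, and each platform has multiple hostnames under it. The code first parses the JSON and reorganizes it so that hostname becomes the primary key, producing row after row to be stored in a relational database. What bothered me was that all the code handling the complicated relationships between attributes was stuffed inside the double loop, so the loop became very complex, could not be reused, and was hard to modify and maintain.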

Looking at the problem from a database point of view led to a fairly good solution:

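The dump output of a database is essentially the result of joining two tables, so the primary key can naturally be swapped.
Since the data to parse is the result of a join, an effective way to process it is:
1. First run a simple double loop over the JSON data. The loop does only one thing: unfold the data into its joined form.
2. Use Python's itertools.groupby to reorganize the table, producing a new table with any chosen column as the primary key.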

The code is as follows:

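import itertools


def get_h_platforms(res):
    """Print each hostname followed by the platforms it belongs to.

    sample output
    ctl-zj-061-130-028-019 ['c01.p02', 'c01.p02-kugou']
    ctl-zj-061-130-028-020 ['c01.p02', 'c01.p02-kugou']
    ctl-zj-061-130-028-022 ['c01.p02', 'c01.p02-kugou']
    """
    # Unfold the nested JSON into flat (platform, hostname) rows.
    product = [(p["platform"], device["hostname"])
               for p in res["result"] for device in p["ip_list"]]
    # itertools.groupby only groups adjacent items, so sort by hostname first.
    data = sorted(product, key=lambda x: x[1])
    for key, grp in itertools.groupby(data, key=lambda x: x[1]):
        # Deduplicate and sort the platforms so the output order is stable.
        print(key, sorted({platform for platform, _ in grp}))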
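
For comparison, the same two steps can be sketched in Clojure, where group-by does not need the rows to be sorted first. This is only an illustrative sketch: the function name host->platforms is hypothetical, and the input is assumed to be the dump already parsed into Clojure maps with string keys.

```clojure
(defn host->platforms
  "Regroup platform-keyed data so that hostname becomes the key."
  [res]
  (let [;; Step 1: unfold into flat [platform hostname] rows.
        rows (for [p      (get res "result")
                   device (get p "ip_list")]
               [(get p "platform") (get device "hostname")])]
    ;; Step 2: group the rows by hostname and keep the distinct platforms.
    (into (sorted-map)
          (map (fn [[host pairs]]
                 [host (vec (sort (distinct (map first pairs))))]))
          (group-by second rows))))

(host->platforms
 {"result" [{"platform" "c01.i01"
             "ip_list"  [{"ip" "192.168.0.1" "hostname" "host-a"}
                         {"ip" "192.169.1.1" "hostname" "host-b"}]}
            {"platform" "c01.i05"
             "ip_list"  [{"ip" "192.168.0.1" "hostname" "host-a"}]}]})
;; => {"host-a" ["c01.i01" "c01.i05"], "host-b" ["c01.i01"]}
```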

pattern

Tue, 28 Feb 2017 00:00:00 +0000

patterns = programming with abstractions that are not powerful enough

Let me begin by quoting Paul Graham:

When I see patterns in my programs, I consider it a sign of trouble. The shape of a program should reflect only the problem it needs to solve. Any other regularity in the code is a sign, to me at least, that I'm using abstractions that aren't powerful enough.
Paul Graham - Revenge of the Nerds

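It actually took me quite a while to think of a non-trivial example that explains this quote well. Unexpectedly, I found one while learning Clojure. The example applies the same computation to every element of an array: one version processes sequentially, the other in parallel.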

Two versions in Go

Sequential version

res := make([]float64, N)
for i, xi := range data {
    func(i int, xi float64) {
        res[i] = doSomething(i, xi)
    }(i, xi)
}

Parallel version

type empty struct{}
...
data := make([]float64, N)
res := make([]float64, N)
sem := make(chan empty, N) // semaphore pattern
...
for i, xi := range data {
    go func(i int, xi float64) {
        res[i] = doSomething(i, xi)
        sem <- empty{}
    }(i, xi)
}
// wait for goroutines to finish
for i := 0; i < N; i++ {
    <-sem
}

Two versions in Clojure

Sequential version

(defn myfun [coll]
  (map doSomething coll))

Parallel version

(defn myfun [coll]
  (pmap doSomething coll))

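The difference in abstraction level

Comparing the four pieces of code in these two languages, it is easy to see that the sequential examples are both quite simple. When we switch to the parallel version, however, the Go implementation is much harder than the Clojure one: it needs a semaphore pattern built from Go channels. In Clojure, by contrast, it is enough to replace map with pmap. In this example, Clojure provides an abstraction powerful enough to express the parallel semantics easily.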
