How to organize software into classes and modules (aka packages, namespaces)?

For a long time, I have been intrigued by the “rules” that guide someone in properly organizing software code into modules (aka packages, namespaces) and classes. I imagine that applying those “rules” to the same input (i.e. the software requirements) should always yield the same output (i.e. packages/modules/namespaces and classes), like in mathematics. There are hints here and there, but I didn’t find anything convincing and complete in this regard, so I decided to figure it out for myself; I describe my ideas in the following sections and I hope you’ll find them useful.

In my investigation, I found the following books useful:

  • Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides is an inspiring and practical book
  • Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans is inspiring and important for my conclusion
  • Patterns of Enterprise Application Architecture by Martin Fowler has many useful sections even though some are outdated
  • Microservices Patterns by Chris Richardson is an inspiring and practical book

1. Problem

The modules (aka packages or namespaces) and the classes are the main building blocks when writing software code. Assume the common case in which each class is written in its own file and a module is a directory containing such files. When working on complex software, it might be difficult to determine which classes to create and how to group them into modules. It takes deliberate focus to resist the temptation, posed by frameworks or technical aspects, to organize the software the wrong way.

PS: for a scripting language, e.g. JavaScript, the equivalent of a class could be a set of highly cohesive functions placed in the same file.

2. Layers and Modules

From the start, the most promising idea was to translate the layers an application might have into modules. Besides the books above, see also these excellent articles:

  1. https://alistair.cockburn.us/hexagonal-architecture/
  2. https://blog.cleancoder.com/uncle-bob/2012/08/13/the-clean-architecture.html
  3. https://jeffreypalermo.com/2008/07/the-onion-architecture-part-1/

After reading at least these articles it should become obvious that it’s not easy to “translate” the layers to modules because the module’s structure looks like a (file system) tree while the layers are like a horizontal set of lanes or some concentric circles. Another problem is with the layers themselves: which one should be used?

3. The Border

[Figure: border]

One thing that is easy to spot is that an application (e.g. desktop, microservice, etc.) has a border; in this regard, I find Hexagonal architecture especially useful. The border is represented by all the application’s external interfaces plus the adapters through which it accesses the external services (exposed through their external interfaces).

Examples of external interface implementations:

  • RESTful endpoint handlers
  • queue message listeners
  • WebSocket message listeners

Everything that asks for something from an application requires an external interface to talk to.

Examples of adapters:

  • DAO (i.e. Data Access Object, implements the CRUD operations)
  • Lucene index reader/writer (a kind of a NoSQL database)
  • (messages) stream reader/writer (e.g. Kafka topic reader/writer)
  • file system reader/writer
  • command line reader/writer
  • WebSocket message publisher
  • RESTful client

Everything the application asks to do something is accessed through an adapter.

The external interface implementations are named (primary) adapters in the Hexagonal Architecture. Still, I prefer the external interface name because it seems to me to convey better that the application exposes/offers something to an external system while an adapter sounds to me like something used by the application to access an external system (this is the meaning I use for it).

The border maps to the Interface Adapters ring in the Clean Architecture; what I name external interface has nothing to do with the External Interfaces segment in the Frameworks and Drivers layer of the Clean Architecture! I have an idea of what that segment means but, because the author doesn’t detail it in his article, I won’t either. The name external interface is meant to convey the idea of an application interface for the outside world. Another name could be remote interface, but external interface seems to fit better: even a process-to-process interface is an external interface, though not a remote one (i.e. one allowing access from a remote machine).

In the Onion Architecture, the border maps to the outermost ring; the infrastructure segment of that ring inspired me to name infrastructure the module that keeps the border classes (see below).

But why is the border necessary in the first place?

Because the remaining part of the application, let’s call it the body, uses a “language” the application clients (e.g. systems, user agents) don’t know. The border “translates” the external requests into the body “language”; the body doesn’t care about the “languages” the border must understand, its job is to implement the application features and that’s enough.

For example, some external systems might use a queue to access the application features, in which case the external interface implementation will be a message listener, whilst others might use a RESTful endpoint, in which case the external interface implementation will be a RESTful handler. The external interface implementations will “talk” differently with the external systems but in the same way with the body.
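As a minimal sketch of this idea (in Python, with hypothetical names such as RenamePlaylist, PlaylistBody, RestRenameHandler and QueueRenameListener, and a made-up "id|name" queue payload), two external interface implementations can parse different wire formats yet hand the body the same message type:

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RenamePlaylist:
    """Body "language": the only message shape the body understands."""
    playlist_id: int
    new_name: str


class PlaylistBody:
    """Stand-in for the body; it knows nothing about HTTP or queues."""

    def __init__(self) -> None:
        self.names = {}

    def rename(self, command: RenamePlaylist) -> str:
        self.names[command.playlist_id] = command.new_name
        return command.new_name


class RestRenameHandler:
    """External interface for a RESTful endpoint (path parameter + JSON body)."""

    def __init__(self, body: PlaylistBody) -> None:
        self._body = body

    def handle(self, path_id: str, json_body: str) -> dict:
        payload = json.loads(json_body)
        command = RenamePlaylist(int(path_id), payload["name"])
        return {"status": 200, "name": self._body.rename(command)}


class QueueRenameListener:
    """External interface for a queue (assumed payload layout: b"id|name")."""

    def __init__(self, body: PlaylistBody) -> None:
        self._body = body

    def on_message(self, raw: bytes) -> None:
        playlist_id, new_name = raw.decode("utf-8").split("|", 1)
        self._body.rename(RenamePlaylist(int(playlist_id), new_name))


# Both channels end up calling the body in exactly the same way.
body = PlaylistBody()
response = RestRenameHandler(body).handle("7", '{"name": "Road trip"}')
QueueRenameListener(body).on_message(b"8|Workout")
```

Only the two border classes know about JSON or the queue payload; adding a third channel would mean adding a third border class, with no change to PlaylistBody.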

The difference between the border and the body might be incredibly blurry when using a framework or some particular technology. For example, a framework could directly use the body based only on annotations, hence one could legitimately wonder where the external interface is; well, it’s still there though completely implemented by the framework. Change the application to provide e.g. an additional communication channel to the same body, one the framework doesn’t support, and the external interface will become visible.

4. From Border to Modules

Let’s determine the “rule” for module creation based on the examples below.

4.1. Examples

a. Application exposing RESTful endpoints and persisting to a database

In this situation, I like to have in the application root these modules (directories):

  • datasource (e.g. SQL/NoSQL DAO classes)
  • rest (RESTful handlers and clients)
    • if crowded, I might split its content into in (for handlers) and out (for clients) modules

b. Application exposing RESTful endpoints, persisting to a database and using a messaging system (e.g. RabbitMQ)

In this situation, I like to have in the application’s root these modules (directories):

  • datasource
  • queue (message listeners and producers)
    • I might split its content into in (for listeners) and out (for producers) modules
  • rest

c. A complex application providing a lot of external interfaces and using a lot of adapters

In this situation, I like to have in the application’s root this structure:

  • datasource
    • cache (in-memory database used for caching but not distributed locking)
    • dao (DAO classes for SQL or NoSQL databases)
    • index (Lucene index reader/writer)
    • fs (file system, e.g. for loading/writing data from/to CSV, XML, etc)
  • infrastructure (or io or adapter)
    • mem (in-memory database used for other than caching, e.g. for distributed locking)
    • queue
    • rest
    • shell (command line reader/writer)
    • stream (Kafka topic consumer/publisher)
    • websocket (message listeners/publishers)
    • mvc (for MVC controllers)

One might think that adapter is a better name than infrastructure, and I would agree if only adapters were present there; however, I intend to add some things that Clean Architecture puts in the Frameworks and Drivers layer, e.g. classes that allocate system resources such as thread pools (see later).

d. UI Applications

UI applications use fewer external interface types than a backend application, usually RESTful and WebSocket. Modules structure:

  • rest
  • websocket

If there are many rest/websocket submodules, I group them by their target, e.g. the microservice they belong to.

4.2. Conclusion

You can see that I prefer keeping the datasource module in the application root; that’s because there is usually a lot of activity there, otherwise I’d move it into the infrastructure module. For me, the “rule” is:

keep the most crowded external interface & adapter modules (EI&A) in the application root and put everything else into the infrastructure module. If there are too many EI&A modules in the root (including the infrastructure module), then use the infrastructure module to keep everything (for me, more than 3 means too many).

Additional module structuring hints:

  • use in (for external interfaces) and out (for adapters) modules inside infrastructure or its sub-modules
  • use consumer and producer (instead of in/out) inside queue
  • use listener and client (instead of in/out) inside rest, stream and websocket
  • if not crowded, use the event module for websocket, stream and queue
  • name the modules closer to the technology, e.g. topic instead of stream for Kafka

5. The Border Model

[Figure: border model]

Messages (e.g. the diamond and square figures) are exchanged between the exterior and the application body; they are DTOs (Data Transfer Objects) coupled to a particular technology/communication channel type, hence usually incompatible with the body’s “vocabulary”. The border must “translate” the in-DTOs (i.e. the received ones) to the body “language” because the body is oblivious to the external systems. These DTOs are parameters or results of the border class methods, hence they should stay next to the class using them. If used by many classes (e.g. the queue and rest classes), they should be placed into a new infrastructure sub-module named model or dto.

To communicate with the body, the border needs converters/factories to “translate” the in-DTOs to core objects, or the core objects to out-DTOs passed to the external systems as responses or as intermediate requests. I always like to put the factory classes next to their product class because it’s easy to find them starting from the product class. But doing so, the factories creating core objects by taking in-DTOs as parameters would couple the core to the border, which is bad; the solution is to put them in the border model module. However, in practice, I often break this rule (if I’m allowed to) without suffering any bad consequences; still, this wouldn’t work if the body were a library used in many projects (which usually isn’t the case).
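Here is a minimal sketch of such a border model, assuming a hypothetical REST DTO (PlaylistEntryRestDto) and hypothetical converter functions (to_core, to_rest_dto); the point is only that the converters depend on the core class while the core class depends on nothing from the border:

```python
from dataclasses import dataclass


# --- core: knows nothing about the border ---
@dataclass(frozen=True)
class PlaylistEntry:
    title: str
    location: str


# --- border model: DTO coupled to the REST channel ---
@dataclass(frozen=True)
class PlaylistEntryRestDto:
    entryTitle: str  # camelCase because it mirrors an assumed JSON payload
    url: str


# --- border model: converters live next to the DTO, not in the core ---
def to_core(dto: PlaylistEntryRestDto) -> PlaylistEntry:
    """Translate an in-DTO into the body "language"."""
    return PlaylistEntry(title=dto.entryTitle.strip(), location=dto.url)


def to_rest_dto(entry: PlaylistEntry) -> PlaylistEntryRestDto:
    """Translate a core object into an out-DTO for the REST channel."""
    return PlaylistEntryRestDto(entryTitle=entry.title, url=entry.location)
```

If to_core were instead a method of PlaylistEntry, the core would have to import PlaylistEntryRestDto, which is exactly the coupling described above.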

6. The Body

[Figure: body]

In previous sections, we split an application into border and body. One could consider the border a set of doors through which messages pass between the external systems and the body. A message might reach the body through e.g. a queue (e.g. RabbitMQ) or a RESTful endpoint, but if the target feature is the same then the message type reaching the body must be the same; the alternative is for the body to “understand” each external system’s messages, which would couple it to them without any advantage, only a lot of headaches.

In the above picture, the diamond, square and circle message types represent the set of message types (aka “language”) exchanged between the border and the external systems (or users) or between the border and the body. As one can see, the border is a polyglot: it “talks” all the languages the external systems “talk” (e.g. the RESTful or RabbitMQ “language”). The body, on the other hand, is not a polyglot: it “talks” its own “language” and the border must understand it! This might not be obvious, especially when using frameworks that automatically convert the external system messages to the body ones. This is fine as long as one doesn’t give in to the temptation of fitting the body “language” into the border “languages”.

From a technical point of view, the messages exchanged between the layers (e.g. border, body) have the role of DTO; if their type/shape/class differs between the layers, a conversion effort is necessary. Theoretically, there should be a large effort to “translate” the DTOs from one layer to another but in practice, the frameworks do it automatically hence the same DTO could traverse multiple layers.

Technically, all those arrows pointing to the body constitute the use cases list, i.e. what the application can do; however, the business domain might consider a set of them a distinct, single use case. The use cases list is very similar to a book’s contents page hence, easily identifying them is something desirable (I’ll talk more about this in the next section).

7. The body modules and classes

My approach is to create a manager module where I put the Manager classes in charge of the use cases; e.g. PlaylistManager could be a Manager class dealing with audio playlist management. The manager module is the application’s equivalent of a book’s contents page; hence, no matter how trivial it might be, a Manager class must not be skipped, otherwise it might become difficult to determine what the application can do.

I borrowed the name Manager from Martin Fowler’s book where a Manager class is the name for an Application Service and I kept the meaning, i.e. a Manager class is an Application Service. The Manager classes use the infrastructure (e.g. DAO, message publisher) and the application core (see more about it below) to implement the use cases.

The Manager classes orchestrate the activities performed by the adapters and the core; they won’t do business work but only decide who does what and delegate the job to someone else. The Managers accept requests from the external systems through the external interfaces and use the adapters to accomplish their purpose (e.g. read/store from/into DB, invoke a RESTful service, publish a WebSocket message).

Besides the border and managers, the remaining part of the application is the core (more about it later).

In Onion Architecture the core includes these Manager classes plus “my” notion of core; there, the Manager classes are named Application Services. Application Service is a good name because it nicely contrasts with another type of service, i.e. Domain Services (another ring in Onion Architecture), but I like the term manager as the module name and Manager as the class suffix because they are shorter.

The Manager classes map perfectly to the Use Cases ring from Clean Architecture. “My” core notion maps to the Entities ring, though I’m not sure to what extent, because I’m not totally sure that the Entities layer includes the Domain Services too (more about it later).

[Figure: managers]

The messages between the Managers and adapters are usually core objects. However, they could also be DTOs received from the border or even from the external systems and passing through the border; it is the adapter’s responsibility to understand them.

Be aware that DTO is a role; I call core object one used by the core, but if passed between layers it’s a DTO too. There are also “pure” DTOs, e.g. a criteria object used to query some DB, which might never reach the core but would go directly through the Manager to the DB adapter. In this context, if the Manager does nothing but delegate to the adapter, the temptation to skip it is huge. For a PoC or a small application, one might give in to this temptation; but if one does so, and some Manager classes are missing from the manager module, it will later be hard to tell what the application does, because looking only into the manager module (aka the use cases list) won’t be enough.

Many Managers might use the same input or output message types/classes. In this situation, I create a module named model or dto inside the manager module to keep those types/classes. The model module should also contain the converter or factory classes that create the core-accepted message types, similar to the border model; it won’t contain the factory classes that have only core dependencies, those will stay in the core. As for the border model, I again break the rule by putting all core-object factories in the core because I like to have the factory next to its outcome class. Usually, I get away without harm, but, as for the border model, if the core is a library, the approach won’t work.

Examples of activities the Manager might orchestrate:

  • load an audio playlist from DB (adapter), sort it (core), remove the duplicates (core), then store it back to the DB (adapter)
  • check the town hall’s website for new building authorization documents, download them, extract their content, and index them with Lucene (no core activity here)
  • accept a payment transaction (Manager input parameter/message/DTO), load the financial actor profiles from the DB (adapter), compute the fees (core), update the transaction details (core), store them into the DB (adapter)
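The first activity above can be sketched as follows. This is only an illustration under assumptions: InMemoryPlaylistDao stands in for a real DB adapter, duplicates are assumed to be entries sharing the same location, and the names tidy_up, sort_by_title and remove_duplicates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaylistEntry:
    title: str
    location: str


# --- core: business logic only, no I/O ---
def sort_by_title(entries: list) -> list:
    return sorted(entries, key=lambda entry: entry.title.lower())


def remove_duplicates(entries: list) -> list:
    """Keep the first entry seen for each location (assumed duplicate rule)."""
    seen, unique = set(), []
    for entry in entries:
        if entry.location not in seen:
            seen.add(entry.location)
            unique.append(entry)
    return unique


# --- adapter: in-memory stand-in for a real DAO ---
class InMemoryPlaylistDao:
    def __init__(self, playlists: dict) -> None:
        self._playlists = dict(playlists)

    def load(self, name: str) -> list:
        return list(self._playlists[name])

    def store(self, name: str, entries: list) -> None:
        self._playlists[name] = list(entries)


# --- manager: decides who does what, does no business work itself ---
class PlaylistManager:
    def __init__(self, dao: InMemoryPlaylistDao) -> None:
        self._dao = dao

    def tidy_up(self, name: str) -> list:
        entries = self._dao.load(name)             # adapter
        entries = sort_by_title(entries)           # core
        entries = remove_duplicates(entries)       # core
        self._dao.store(name, entries)             # adapter
        return entries


dao = InMemoryPlaylistDao({
    "mix": [
        PlaylistEntry("B side", "/music/b.mp3"),
        PlaylistEntry("A side", "/music/a.mp3"),
        PlaylistEntry("B side (copy)", "/music/b.mp3"),
    ]
})
tidied = PlaylistManager(dao).tidy_up("mix")
```

Reading tidy_up is enough to learn what the use case does, while the how stays in the core functions and in the adapter.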

8. The Core

The core is the part of the application that deals only with the business problem it is supposed to solve and nothing else. If the business problem is about managing audio playlists, then the core would work only with concepts/notions/nouns regarding audio playlists while excluding the rest, e.g.:

  • playlist (it has a name, a location and one playlist-entries object)
  • playlist-entries (is a collection of one or many playlist-entry objects)
  • playlist-entry (it has a title and a location, e.g. a file path or YouTube identifier)

The core for the above example won’t deal with e.g.:

  • persistence
  • presentation
  • messaging systems
  • caching systems
  • locking mechanisms
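The three playlist notions listed above could be sketched as plain classes that import nothing related to persistence, presentation, messaging, caching or locking. The attribute names and the "at least one entry" check are my reading of the descriptions above, not a definitive model:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaylistEntry:
    title: str
    location: str  # e.g. a file path or a YouTube identifier


@dataclass(frozen=True)
class PlaylistEntries:
    entries: tuple  # one or many PlaylistEntry objects

    def __post_init__(self) -> None:
        if not self.entries:
            raise ValueError("playlist-entries needs at least one entry")


@dataclass
class Playlist:
    name: str
    location: str
    entries: PlaylistEntries


playlist = Playlist(
    name="mix",
    location="/playlists/mix.m3u",
    entries=PlaylistEntries((PlaylistEntry("Song", "/music/song.mp3"),)),
)
```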

One might observe that the core is usually small compared to the rest of the application – that’s true, and Eric Evans points it out too in his book (DDD). The core might be overlooked completely if the application is small enough and/or a framework abstracting the infrastructure is used! For example, if the application is about extracting some data from the database and then sending it through a RESTful endpoint back to the user, then nothing might remain to do in the core (e.g. see Spring Data REST).

On the other hand, if the application is complex the core will contain Entities, Value Objects, and (Domain) Services (see DDD by Eric Evans). I see a Domain Services layer on top of the Entities layer, the latter containing the Value Objects; additionally, both layers could contain interfaces and abstract classes.

For the core, I create the model (or domain) module, and inside it I usually create these modules:

  • one module for each Entity type (each contains its Value Objects)
  • service (contains the Domain Services)

If the Service operation parameters or their return types use classes other than Entities or Value Objects, then those classes can sit next to the Service in a dedicated service sub-module. The commonly used ones could sit in a sub-module of the service module named dto.

The same could happen for the Entities, in which case I create an entity module where I put all the Entities; if there are not many, I put the commonly used Value Objects directly into the entity module, otherwise into a vo sub-module of entity.

Let’s visualize a crowded model/domain structure:

  • model (or domain)
    • entity
      • entity1
      • entityN
      • vo
    • service
      • service1
      • serviceP
      • dto

8.1. The Relation with Onion Architecture

The Domain Services and the Domain Model from the Onion Architecture (named so only in part 1) map to what I call the Domain Services and Entities layers (both forming the core). The problem is with the Domain Services layer which according to Onion Architecture contains the repository interfaces (see https://jeffreypalermo.com/2008/07/the-onion-architecture-part-1/):

The first layer around the Domain Model is typically where we would find interfaces that provide object saving and retrieving behavior, called repository interfaces. The object saving behavior is not in the application core, however, because it typically involves a database.

From my point of view, the Domain Services layer should contain Services that do what Eric Evans says about them in his book:

When a significant process or transformation in the domain is not a natural responsibility of an ENTITY or VALUE OBJECT, add an operation to the model as a standalone interface declared as a SERVICE. Define the interface in terms of the language of the model and make sure the operation name is part of the UBIQUITOUS LANGUAGE. Make the SERVICE stateless.

The purpose of a (Domain) Service is to contain business logic (that’s why the Domain word) that won’t fit an Entity or Value Object; it’s about properly placing that type of business logic but not about interfaces shaping the interaction with some external system (e.g. repository interfaces are shaping the interaction with the DB). For example, if PlaylistEntries is a Value Object containing a collection of file paths, the operation addPlaylistEntries(ple1, …, pleN) that returns a new PlaylistEntries seems to fit into a PlaylistEntriesService instead of the PlaylistEntries class.
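A minimal sketch of that example, assuming that “adding” means concatenating the file paths in argument order (the original operation is not specified further) and using the Python spelling add_playlist_entries:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaylistEntries:
    """Value Object: an immutable collection of file paths."""
    paths: tuple


class PlaylistEntriesService:
    """Stateless Domain Service: pure business logic, no external systems."""

    @staticmethod
    def add_playlist_entries(*playlist_entries: PlaylistEntries) -> PlaylistEntries:
        merged = []
        for entries in playlist_entries:
            merged.extend(entries.paths)
        # A new Value Object is returned; the inputs are left untouched.
        return PlaylistEntries(tuple(merged))


ple1 = PlaylistEntries(("/music/a.mp3",))
ple2 = PlaylistEntries(("/music/b.mp3", "/music/c.mp3"))
combined = PlaylistEntriesService.add_playlist_entries(ple1, ple2)
```

Nothing here shapes an interaction with a database or another external system, which is the contrast with repository interfaces made above.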

8.2. The Relation with Clean Architecture

The way I define core maps to the Entities layer in Clean Architecture. Although the Entities layer doesn’t explicitly include the (Domain) Services I would say that the author doesn’t exclude them either; here is the Clean Architecture definition for the Entities layer:

Entities encapsulate Enterprise wide business rules. An entity can be an object with methods, or it can be a set of data structures and functions. It doesn’t matter so long as the entities could be used by many different applications in the enterprise.

For me, a Domain Service is the Service Eric Evans talks about in his book (DDD) which seems to fit the Clean Architecture Entities layer. However, I prefer an additional layer, i.e. Domain Services, to differentiate between (Domain) Services and Entities.

For more about the Domain Services and the difference from Application Services see at least the chapter Service Layer, sections Kinds of “Business Logic” and Implementation Variations in the Patterns of Enterprise Application Architecture by Martin Fowler. Focus on the concept of domain logic compared to the application logic; unfortunately, M. Fowler doesn’t explicitly define the Domain Services, he only talks about the Application Services but combined with Eric Evans’ definition of Service it should be clear that a Domain Services layer sits between the Application Services and the Entities layer. See also the section SERVICES and the Isolated Domain Layer in DDD by Eric Evans, to understand why:

It can be harder to distinguish application SERVICES from domain SERVICES.

9. The Full Picture

[Figure: layers]

The things between the coloured lines are the layer names; I put the Application Services in a square only for graphical reasons, otherwise, it would be in between two lines too. I point out again that for me, the Domain Services and Entities (which include the Value Objects as defined by DDD) form the core.

I said I feel the way I view an application architecture is closer to the Clean Architecture but you might notice that I missed a layer, i.e. Frameworks and Drivers (besides adding Domain Services); from the point of view of the code written by a developer that layer is almost non-existent. Usually, the code I see sitting in it is for allocating system resources, e.g. thread pools, database connection pools, and registering objects with the DI (dependency injection) framework. However, in practice this kind of code doesn’t deserve a special module, it can stay in the infrastructure (I always put in it the system resource allocator classes) and sometimes in the rest of the modules. The last part might feel wrong, especially when thinking of core classes so let’s see an example.

Suppose a DI framework is used to create a Manager instance by providing it with various infrastructure dependencies (e.g. a DAO class). The code that wires the dependencies into the Manager is the “glue code” the Frameworks and Drivers section is talking about; when using e.g. the Spring Framework, the usual approach is to create a config module in which to put @Configuration annotated classes implementing the “glue code” (i.e. @Bean annotated methods). Another approach is to focus more on the Factory Method and Abstract Factory patterns; the “glue code” creating the Manager is a Factory Method implementation! One could put the factory class (i.e. the @Configuration annotated class) next to the Manager class; I find it logical to search for a factory class next to its outcome class instead of who knows where else (remember that I talked about this preference before). The same could happen for the Domain Services or Entities layer; however, in practice, I try to isolate at least the Entities layer from all frameworks, including DI.
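Stripped of any DI framework, the “factory next to its outcome class” idea can be sketched in plain Python. PlaylistDao, PlaylistManagerFactory and the connection URL are hypothetical; with Spring, the create method would roughly correspond to a @Bean annotated method:

```python
# Imagined location of all three classes: the manager module.

class PlaylistDao:
    """Stand-in for an infrastructure dependency."""

    def __init__(self, connection_url: str) -> None:
        self.connection_url = connection_url


class PlaylistManager:
    def __init__(self, dao: PlaylistDao) -> None:
        self.dao = dao


class PlaylistManagerFactory:
    """Glue code: wires the infrastructure into the Manager (Factory Method)."""

    def __init__(self, connection_url: str) -> None:
        self._connection_url = connection_url

    def create(self) -> PlaylistManager:
        return PlaylistManager(PlaylistDao(self._connection_url))


manager = PlaylistManagerFactory("db://localhost/playlists").create()
```

Someone looking for how a PlaylistManager gets built finds the answer in the file next to it rather than in a separate config module.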

TBC