Software Requirements Specification
Performance (Response Time, Throughput, and Scalability)
Modules integration (SOA/Enterprise Integration)
Testing (Continuous Integration)
Enterprise Microservices: What it takes?
Please note that this page is under construction ...
In this page, I will share some thoughts on how to create enterprise applications that meet the requirements listed above
by leveraging a microservices architecture and using a workflow engine and a messaging system
to manage and automate the execution of the services.
I will also present some frameworks and tools related to the continuous integration, deployment, and orchestration of services.
The idea is to be able to manage the complexity of microservices and act on their flow of execution.
We also want to be able to audit and monitor the different components of the application and produce reports
that help manage performance issues, identify bottlenecks, and spot problems and errors.
The sample design (above) uses an entry service that reads data (CSV, XML, records, ...) from external sources (file system, RDBMS, ...).
This entry point of the application produces payload data that will be stored in a staging storage (HDFS).
It will also create a payload message that will be used as an entry point to start the workflow engine (jBPM).
The payload message will also be used as a means to communicate information between the tasks of the flow and the services.
Each time an entity acts on the payload data, it will produce a new payload message and send it to a messaging system (Kafka).
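As a hedged sketch, the payload message described above might be modeled like this; the class and field names are assumptions for illustration, not part of the original design:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of the payload message exchanged between the
// workflow tasks and the services; all names here are assumptions.
public class PayloadMessage {
    private final String id;            // correlates all messages of one workflow instance
    private final String dataLocation;  // where the payload data lives (e.g. an HDFS path)
    private int retryCount;             // incremented each time a task is retried
    private final Map<String, String> parameters = new HashMap<>();

    public PayloadMessage(String dataLocation) {
        this.id = UUID.randomUUID().toString();
        this.dataLocation = dataLocation;
    }

    // Each entity that acts on the payload data adds its own parameters
    // before sending the new message back to the messaging system.
    public void addParameter(String key, String value) {
        parameters.put(key, value);
    }

    public String getParameter(String key) { return parameters.get(key); }
    public String getId() { return id; }
    public String getDataLocation() { return dataLocation; }
    public int getRetryCount() { return retryCount; }
    public void incrementRetryCount() { retryCount++; }
}
```

The message itself stays small: it points to the payload data in the staging storage instead of carrying it.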
First, the application server will read the payload message from the messaging system and trigger the workflow engine to start the services workflow.
The tasks of the workflow should implement simple business logic and should be responsible only for adding custom parameters to the payload message.
Each task of the workflow should submit the payload message to a specific topic of the messaging system.
An administrator (or an automatic task) should be able to resume the workflow and retry a specific task if the corresponding service failed to finish its execution.
The services should subscribe to their specific topics and read messages from the messaging system.
Each service should implement the business logic specific to the executed task.
If the execution is successful, the service should add new parameters to the payload message and submit it to the main topic of the application.
The new payload message should be read by the application server, which will resume the workflow to execute the next task.
If the service fails to execute the task, it should send a new payload message to a specific retry topic.
The new payload message should include new parameters (like the retry number).
If the retry number reaches the maximum number of retries, a new payload message can be sent to an error topic that can be handled later by an administrator.
The services should always check their retry topics for existing messages and handle those first before handling the messages from their main topics.
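The retry routing described above can be sketched as follows; the topic naming scheme, the maximum retry count, and the backoff base are all assumptions, not values from the original design:

```java
// Sketch of the retry strategy: route a failed message to the retry
// topic, or to the error topic once the maximum is reached, and space
// out retries with an exponential delay. All constants are hypothetical.
public class RetryStrategy {
    static final int MAX_RETRIES = 3;          // assumed limit
    static final long BASE_DELAY_MS = 1000L;   // assumed backoff base

    // Decide where a failed message goes next.
    public static String targetTopic(String service, int retryCount) {
        return retryCount < MAX_RETRIES
                ? service + ".retry"
                : service + ".error";
    }

    // Exponential backoff: wait longer before each successive retry.
    public static long delayBeforeRetryMs(int retryCount) {
        return BASE_DELAY_MS * (1L << retryCount);  // 1s, 2s, 4s, ...
    }
}
```

The delay keeps a repeatedly failing task from flooding its retry topic.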
The services should track the number of failed retries and implement delay strategies that impose a delay before reading messages from the retry topic.
Microservices Workflow Automation: jBPM
Using a workflow engine to automate the microservices workflow execution
makes the flow of execution of microservices easy to understand and allows identifying any bottlenecks in the whole system.
It also allows an administrator (or an automatic backend service) to resume or cancel the workflow if it was blocked in a specific task.
In a complex enterprise application, the number of microservices can be huge
and any attempt to design one single process model for all microservices will make the process very difficult to design, understand, and maintain.
It is normal to imagine that an enterprise application is composed of multiple business domains
that are decoupled from each other but still need to interact and communicate in order to complete cross-domain tasks.
Each business domain will need to have its own workflow, and it will be responsible for orchestrating the services that it owns.
It's also not always easy, or even possible, to identify one single module that will act as the owner or the main entry point for all the modules.
In other words, it's difficult to design a main workflow that will orchestrate the flow of execution of all modules.
This doesn't mean that we can't design a reasonable number of workflows for the main business activities.
We just need to consider that in some cases the integration between different modules might use other communication approaches,
e.g. REST APIs, messaging systems, ...
See this page for more information and code samples of jBPM: Java Business Process Model (jBPM)
Messaging System: Kafka
Using a messaging system to manage communication between tasks of the flow and services helps in decoupling the communication between them.
It also allows adding new instances of services without the need to apply any additional configuration.
A service needs to subscribe to a specific topic, read messages from that topic, and execute the related task.
The load on services is automatically balanced on the different instances
and each instance will execute tasks based on the power of the machine where it's installed (RAM, CPU, Disk, Network, ...).
A service should be responsible for notifying about the failure of execution of a task,
and it should implement strategies to manage failed tasks and retry their execution if required.
The payload message should include clear metadata so the services can interpret it and take the adequate decision.
It's expected that a service might consume a message but fail to notify whether the task was successfully executed or not.
Such situations are difficult to manage. A service might successfully complete a task but fail to notify that.
The opposite scenario can also happen: a service might fail to complete a task and also fail to notify that.
In another scenario, the service times out or simply dies, and there's no way to get any feedback on the state of the task it was executing.
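Because a service may execute a task but die before reporting the result, the same message can be redelivered and processed twice, so task execution should be idempotent. A minimal sketch, with the assumption of an in-memory set of processed message ids (a real deployment would need durable storage):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of idempotent message handling: a message id that was already
// processed is skipped, so redelivery after a lost notification is safe.
// The in-memory set is an assumption; production code would persist it.
public class IdempotentProcessor {
    private final Set<String> processedIds = new HashSet<>();

    // Returns true if the task ran now, false if the message id had
    // already been processed (duplicate delivery).
    public boolean process(String messageId, Runnable task) {
        if (!processedIds.add(messageId)) {
            return false;  // duplicate: the task already ran
        }
        task.run();
        return true;
    }
}
```

With this in place, redelivering an unacknowledged message is harmless: the duplicate is detected and dropped.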
The workflow engine makes it easy to track and monitor the state of the execution of tasks
and it's possible to resume or cancel a task based on some defined strategies (specific events, timeout, ...).
An administrator can use a dashboard and monitor the processes and act on them if needed.
We can implement a monitoring service that also acts on the executed tasks using the same or different strategies.
See this page for more information and installation steps of Kafka: Install and configure Apache Kafka
Services Configuration and Registration: ZooKeeper
Using ZooKeeper as a persistent storage allows distributed services to share common configuration.
ZooKeeper can be used by services for coordination, registration and discovery, and leader election.
It can be used to share the public keys of microservices, which will be used to validate JWT tokens when handling microservice requests.
See this page for more information and code samples of ZooKeeper: Apache ZooKeeper
Securing Service to Service Communication: JWT
I will consider two types of communications between services: external communication and internal communication.
The first type of communication requires an authentication from an end user (or an external service).
In this case it's necessary to verify that all requests are authenticated and authorized before executing the related tasks.
It's also important to ensure that sensitive data is encrypted.
The second type of communication happens between internal services
(let's assume they are all backend services sitting "safely" behind a firewall,
which is often the justification given when one asks why backend services are not secured).
In most cases we don't need internal services to authenticate when communicating with other internal services
but we must validate that services are authorized before executing their requests
and we must validate that the request really comes from the source as claimed.
Identifying the type of communication is important to decide the requirements (authentication, authorization, encryption)
and the solutions to secure the communication between services.
JWT (JSON Web Token) provides the infrastructure that services can use to securely transmit information.
The payload data of the token can hold the information required to identify the issuer and validate the request
(in general the payload data should not contain sensitive information).
The token has a signature that is used to verify the token and ensure the integrity of its data.
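As an illustration of how a signature protects the token's integrity, here is a simplified sketch using HMAC-SHA256 from the JDK. This is not a full JWT implementation (no header, no base64 claim encoding), only the HS256 signing mechanism; the key and payload values are made up:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Simplified sketch: sign and verify a payload with HMAC-SHA256,
// the mechanism JWT's HS256 algorithm uses. Not a complete JWT library.
public class TokenSigner {
    public static String sign(String payload, byte[] secretKey) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secretKey, "HmacSHA256"));
            byte[] sig = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static boolean verify(String payload, String signature, byte[] secretKey) {
        // Constant-time comparison to avoid timing attacks.
        return MessageDigest.isEqual(
                sign(payload, secretKey).getBytes(StandardCharsets.UTF_8),
                signature.getBytes(StandardCharsets.UTF_8));
    }
}
```

Any change to the payload after signing makes verification fail, which is exactly the integrity guarantee the token's signature provides.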
The issue here is that the token, if intercepted, can be used by anyone: it can be replayed and will still be verified by the service.
If needed, the payload data can be encrypted, but encryption requires a mechanism to safely share a secret key
(which is not always an easy task when managing a very large set of distributed services).
Using ZooKeeper as a centralized configuration storage might alleviate some of these difficulties
by allowing the services to use a secure place where they can fetch both the secret keys and the public keys of each service.
The payload message will contain the JWT token that services can use to verify requests.
In the case where external end users are interacting with services through an API gateway,
the use of OAuth is important to make sure the requests are authenticated and authorized.
The API gateway needs to manage the OAuth token to verify requests from the end user,
and it will be responsible for creating an internal JWT token that will be used to communicate with internal services.
The OAuth token is an access token that shouldn't hold any sensitive information and should only point to the authorization server.
The internal token might contain all the needed information, and verifying it should not require any communication with the authorization server.
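Verifying the internal token locally might look like this sketch: the claims are checked against the expected audience and the current time with no round trip to the authorization server. The claim names follow common JWT conventions (`aud`, `exp`), and the parsed claims map is assumed to come from a token whose signature was already verified:

```java
import java.util.Map;

// Sketch of local claim validation for an internal token.
// Assumes the signature was already checked; only inspects claims.
public class ClaimValidator {
    public static boolean isValid(Map<String, Object> claims,
                                  String expectedAudience,
                                  long nowEpochSeconds) {
        Object aud = claims.get("aud");
        Object exp = claims.get("exp");
        if (!expectedAudience.equals(aud)) {
            return false;  // token was issued for another service
        }
        return exp instanceof Long && nowEpochSeconds < (Long) exp;  // not expired
    }
}
```

Since no call to the authorization server is needed, internal services can verify requests even if that server is temporarily unreachable.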
To enforce the security of the JWT token, the services might add additional information to the token,
like the identity of the issuer, the targeted services, and the expiration date.
Services Integration: Camel
Camel reduces the time and effort needed to integrate services that consume or produce data.
It uses a Java Domain Specific Language (and XML configuration) to configure the routing rules that drive messages from one service (end point) to another.
It provides a lot of ready-to-use components that ease the integration with many systems (RDBMS, messaging systems, ...).
Search: Solr
Using Solr to index logs generated by microservices
helps developers and administrators to get a centralized place where they can search for specific errors
without having to connect and login to specific machines and search in the log files.
Solr can also be used to index all kinds of events produced by microservices.
It can also be used to index specific messages in specific topics in Kafka.
See this page for more information and code samples of Solr: Apache Solr
Containerisation: Docker
Containerisation brings a lot of benefits to microservices development and deployment.
It enforces a consistent way of delivering components and ensures that services are portable across platforms and environments.
It makes the installation of services easier and ensures that the resources allocated to each service are respected.
It also allows a better orchestration of the microservices and provides a better means to scale and deploy services as needed.
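A minimal Dockerfile sketch for packaging one of the services; the base image, jar name, and port are hypothetical placeholders, not project values:

```dockerfile
# Hypothetical Dockerfile for a Java-based microservice.
# Base image, jar name, and exposed port are placeholders.
FROM eclipse-temurin:17-jre
WORKDIR /app
COPY target/payload-service.jar app.jar
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
```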
Container Orchestration: Kubernetes
Managing containers is no easy task especially when dealing with a large number of containers.
Container orchestration brings solutions to container management issues
and makes deployment, resource management, availability, and scaling of containers easier.
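A hedged sketch of a Kubernetes Deployment for one of the containerised services; the names, image reference, replica count, and resource limits are all placeholders:

```yaml
# Hypothetical Deployment; all values below are illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payload-service
spec:
  replicas: 3                # scale the service horizontally
  selector:
    matchLabels:
      app: payload-service
  template:
    metadata:
      labels:
        app: payload-service
    spec:
      containers:
        - name: payload-service
          image: registry.example.com/payload-service:1.0
          resources:
            limits:
              memory: "512Mi"   # enforce the resources allocated to the service
              cpu: "500m"
```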