June 30, 2020
In my previous article, we walked through running a distributed Kafka Connect setup via Docker. In this article, I will be turning my attention to the Kinesis Kafka Connector and going through the steps to get it working in distributed mode and connecting to external services, e.g. an external Kafka cluster and The Kinesis Firehose service.
The overall intention of this is to go through how to connect a containerized application to external, non-containerized applications. We could just as easily replace the Kinesis Kafka Connector with something like the Elasticsearch Connector.
Read more...
June 08, 2020
Everywhere you look these days in the software development world, you'll find containers in use in a variety of situations and by a vast number of developers and companies. This article from Docker does a good job of explaining containers. However, containers go far beyond Docker, including Kubernetes, Podman, Apache Mesos, and Red Hat OpenShift Container Platform among others.
I'm planning on writing a series of articles that go through various stages of deploying Kafka Connect in a containerized environment. For this article, I plan on getting to the point of deploying a multi-node distributed connector using docker. I will use one source connector in standalone mode that will be used to populate a Kafka topic with data and I will deploy a sink connector in distributed mode to pull the data back out.
Later articles will explore deploying other sink connectors in distributed mode, including the Kafka-Kinesis Connector, via containers. For this article, I will be using Docker via Docker Compose. I am hoping to look more into Podman and attempt deployment via Kubernetes in future articles. I am by no means an expert in any container technology, but I can mostly get around using containers in Docker. So, this is a learning experience on multiple fronts for me.
Read more...
November 27, 2019
When working in AWS, AssumeRole allows you to have access to resources to which you might not normally have access. Some use cases for using AssumeRole is for cross-account access, or in my case, developing locally. Using AssumeRole lets me grant my local code permissions as per those provided by the role to which you're assuming, such as DynamoDB, RedShift, or S3 access.
When I first found myself needing to this, I found several tutorials in the AWS documentation that showcased several types of credential usages, such as federated credentials and temporary credentials. but I never found a similar tutorial for using AssumeRole. I assume the documentation exists, I just didn't find it in the time I had to work on my code. So, here's how I went about implementing AssumeRole in Java using version 1.11.622
.
Read more...
November 19, 2019
When I set out to customize the Kafka-Kinesis Connector to read in a YAML configuration for use by the connector, I decided to go with Kotlin data classes to hold that configuration. The initial versions of these classes aren’t as good as I think they could be, though. For example, all of these data classes make use of var
s rather than val
s. I’d much rather stick to val
s whenever possible due to the fact that they're read-only and cannot be changed once set.
The initial data classes also all use optional (nullable) types for every field. I’d prefer using non-optionals when I can and when I know that certain values should always be non-null, such as the topic or destination stream names. Even for the 'optional' items, like filters
, I'd still prefer to return an empty list rather than null
. These data classes were all created this way because it was the quickest way I could get it all to work together and, at the time, all I wanted was something that worked. So this is me coming back to see what I can do to make these classes better, if anything.
Read more...
October 27, 2019
One thing you should always strive to do is to enable encryption, wherever possible, even if your systems are locked away behind layers and layers of firewalls and other security protocols. Even if your systems are not public facing and don't have a public IP address, you should at the very least encrypt your communications. For this article, we will explore how we enable a Kafka connector to communicate with an SSL-encrypted Kafka cluster.
Read more...
October 20, 2019
When I started out writing the series of articles regarding Kafka Connect and the Kafka Kinesis Connector, I originally forked the one from AWSLabs and I used that fork for all my modifications and originally intended to use it for all of my articles. However, as I progressed through writing them, I realized I might want to submit some of those things for a pull request to the original repo. I didn't want to add the mess of the YAML and multiple destinations to their code. I didn't want to taint their codebase. So, I made a complete copy of that fork and put it into a new repository so that I could continue to do my articles and experimentations with multiple destinations, etc. I'll eventually yank all of my changes out of my fork and only submit the things I hope others would find useful, such as the config validations.
So if you're ever looking through my github repos and see the two Kafka Kinesis Connector repositories, there's the logic behind them. I apologize if it has caused any confusion.
October 12, 2019
One of the issues I have with creating a custom ConfigDef for the Kafka Kinesis Connector is having to deal with multiple, very long, method signatures for ConfigDef.define
. ConfigDef
provides an astonishing 16 define
methods as of version 0.11.0.2
(the version that's used by the project), with one that takes in a ConfigKey
that actually performs the work. All of the other methods eventually call this method, passing in a new ConfigKey
, which means the constructor for this class is also a monster.
Read more...
October 08, 2019
So, we've strengthened up our configuration validation logic for the Kafka Kinesis Connector but there's still more we can do to help those that want to use this plugin. Of course, the principles used in these articles should generally apply to most any Kafka Connector you create. The next step is to help recommend configuration values to our users. For example, we can validate that a user has given us a valid AWS region, e.g. us-east-1
, us-west-1
, etc. but the user may not know all the valid regions available. For this purpose, we can provide a recommender to help recommend values to the user by implementing the ConfigDef.Recommender
interface.
Read more...
October 02, 2019
As it currently stands, you can easily feed invalid property values to the Kafka Kinesis Firehose Connector, such as a batch size of 1000, when the maximum batch size allowed by Firehose is only 500. Thankfully, Kafka Connect provides a mechanism to help with validating properties. The REST API provides a validate endpoint at /connector-plugins/<CONNECTOR_CLASS_NAME>/config/validate
.
Read more...
September 28, 2019
We have several 3-node Kafka connect clusters that each process different topics. If you’ve read my first article regarding my project's customized version of the AWS Labs Kafka-Kinesis Connector, you’ve already read about the corner we backed ourselves into with our first deployments of the modified code. We’ve been working, when possible, to fix those mistakes and to make updates to the clusters and their configurations easy. Well, at least easier than the abomination with which we started.
During one of the updates to the code, updates that added additional properties and an additional configuration file, we ran into an issue with one of the clusters. We were attempting a rolling update of the Kafka-Kinesis Connector plugin and while it went smoothly for the first two clusters, it was a huge pain to get this particular cluster to update appropriately.
Read more...
September 26, 2019
In a previous post, we made modifications to the Kafka-Kinesis Connector that allowed us to use multiple topics and multiple destinations with a single connector task. This time, we’ll go a bit further and add more functionality to the connector. What if we later ended up with a requirement that some of the data going through a topic had to be filtered out into additional destination firehoses?
Read more...
September 22, 2019
Prior to the work that eventually led me to write this article, and the other Kafka-related ones that will follow, I had very little exposure to Kafka and its ecosystem. I knew, more-or-less, what Kafka was and could do the basics to get a broker started and then run producers and consumers from the command line. That was the extent of my working with Kafka until I inherited the maintaining and deployment of Kafka Connect clusters at work.
These clusters all forward data to AWS Kinesis Firehoses, which then stream the data to Redshift, Splunk, Elasticsearch for Kibana dashbboards, or another Big Data store. The connector used by all of these clusters is a variation of this one from AWSLabs.
I say variation due to the fact that we had to modify it in order for it to suit our requirements. Rather than run one connector per topic/firehose combination, we needed it to take in multiple topics and then deliver to the related Kinesis Data Firehose. For each Kafka topic, there is at least one Kinesis Data Firehose. Due to time constraints, this ended up being a very painful process, and the evolution of that code is what will be explored here, at least as far as it can be.
Read more...
September 15, 2019
When I started this site back in 2016, I went with WordPress because it was the quickest solution I could find to get things off the ground. However, as time has progressed, it has become clear that it's a bit overkill for my needs. I don't plan on ever having users sign up for accounts, for example. It's just me and the stuff I write. Could that change? Possibly, but not likely. Meanwhile, static site generators, such as Gatsby, Sphinx, Jekyll, VuePress, and a trove of others have been gaining in popularity. That's when it hit me--all I really need is a basic static-type site.
While I really wanted to go with something like VuePress or Gatsby so that I could sharpen my now archaic JavaScript skills, I decided to go with JBake. JBake has features that were more familiar to me, including Freemarker and Groovy based templates and even has a Gradle plugin so that I could create a build script to help maintain the new version of the blog. Other offerings have similar features, but I've really missed working with Groovy and Gradle. These days, for better or worse, I'm mostly working with Python.
Read more...
April 08, 2019
That's about the plainest way I can say it. If you feel yourself getting burned out, stop. Just stop. Between my job and my attempts at side projects, I ended up severely burned out. Honestly, I still don't feel like I've recovered, but I can't just stop working on the things I enjoy while my job has tried to move me away from Java.
So, I may still be gone a while longer, but I'm going to try and get back into the swing of things and hopefully I'll be posting again soon. Starting small and just moving from there.
Always take care of yourself.
September 28, 2018
Now that we've gone through and updated carmix-creator
with all of the possible Java 7 updates, we can finally start looking at Java 8 and how we can make use of some of the constructs and other updates that Java 8 has to offer. For starters, we will see where we can make use of the biggest advancement to come to the language with the Java 8 release--lambda expressions.
What is a Lambda Expression
A lambda expression is a representation of an anonymous function comprised of a set of parameters, the lambda operator (->) and a function body, which is optionally surrounded by curly braces. Prior to Java 8, you might have used an anonymous class to handle this work, but lambda expressions help make your code less verbose and they can help make your intent more clear.
Read more...
September 25, 2018
If you've ever found yourself in need of validating nested Grails command objects, you've likely done something similar to the below snippet:
static constraints = {
profile validator: { val, obj -> val.validate() }
}
This works well if you want to prevent someone from submitting a form without filling out the required profile data, however, the default error message leaves a lot to be desired.
Read more...
September 21, 2018
As I continue to refactor and refine my carmix-collector project, I've come across a method that I really don't like. It looks too busy and depends on two separate parameters to determine what work should be done inside the method. Here's the method as it currently exists:
private static boolean verifyFile(Path aPath, String indFileDir, String indReadWrite) throws IOException {
if (!Files.exists(aPath)) {
throw new IOException("File Verification: " + aPath.getFileName() + " does not exist");
}
switch (indFileDir) {
case FILE_TYPE:
if (!Files.isRegularFile(aPath)) {
throw new IOException("File Verification: " + aPath.getFileName() + " does not exist");
}
break;
case DIR_TYPE:
if (!Files.isDirectory(aPath)) {
throw new IOException("Directory Verification: " + aPath.getFileName() + " does not exist");
}
break;
}
switch (indReadWrite) {
case READ_FILE:
if (!Files.isReadable(aPath)) {
throw new IOException("File Verification: Cannot read file " + aPath.getFileName());
}
break;
case WRITE_FILE:
if (!Files.isWritable(aPath)) {
throw new IOException("File Verification: Cannot write to file " + aPath.getFileName());
}
break;
}
return true;
}
What we should do is create a new method for each possible parameter value. We should use the Replace Parameter With Explicit Methods refactoring.
Read more...
August 17, 2018
In a previous post, I upgraded my Ripplr application, which was using Grails 3.1.9 to version 3.3.6. One of the changes that have come with Grails 3.3 is the new Grails Testing Support framework, which makes heavy use of Groovy traits as opposed to the previous Test Mixin framework, which relied heavily on annotations and AST transformations.
Initially, I was not looking forward to updating test frameworks. However, during my first few refactorings, it has proven to be mostly worry free. For the most part, you're replacing Annotations with trait implementations. You may have to make some other small tweaks, but so far they have been quick to implement.
Read more...
August 10, 2018
In a previous post, I wrote a unit test using JUnit 4.12 that unfortunately made use of what I believed to be unneeded uses of Mockito and PowerMockito. These tests were written with an earlier Junit 4 mindset, a mindset that was unaware of JUnit Rules. Now that I've had some time to look around some, I'm going to rewrite the tests making use of the TemporaryFolder Rule.
The TemporaryFolder
rule allows us to create folders and files that are deleted after a test is completed. I realize this is something that I could have likely done without the Rule and even prior to my mocking, but using the Rule makes it far easier to work with and performs the cleanup seamlessly.
All of the changes made are very similar and involve setting up a temporary file and writing to the file and then using it for the current test.
Read more...
July 16, 2018
I had intended to write an article on upgrading to Grails 3.3.5, but I ran into an issue that prevented me from completing the upgrade. Luckily, this issue was fixed in the 3.3.6 release.
I will walk through the steps I performed to upgrade my Grails 3.1.9 application, Ripplr to version 3.3.6. These steps will likely work for any 3.1.x to 3.3.x upgrade, but I have only successfully tested upgrading with this specific combination.
Here are the steps I performed:
- Update build.gradle
- Update GORM Version
- Update to Hibernate 5
- Add Grails test mixins dependency for tests that are now using a legacy framework
- Add new dependencies
- Update plugin versions, if at all possible\
- Update application.yml and application.groovy
- Update resources.groovy
- Move Bootstrap.groovy and UrlMappings.groovy out of the default package
- Update Gradle Wrapper
- Add Grails wrapper
- Update Logging Configuration
- Fix broken unit tests (or Ignore them)
- Fix broken integration tests (or Ignore them)
- Verify CORS still works
- Fix Hibernate startup errors that occurred after updates
Read more...
July 09, 2018
Testing your code is an important step in the software development lifecycle and should be done as early as possible. Testing, in general, helps us catch mistakes and bugs in our code. I once worked on a project that had very little tests in place and the majority of those tests failed because your database is different from the test database. But, it also had some very important tests that would check Spring configurations and ensure everything was wired up correctly. These integration tests helped catch a lot of wiring issues quickly and helped us be more aware of how we updated the configurations as we updated the code. Tests, no matter if we're talking about unit, integration, functional, stress, or any other form of testing, help us find problems with the systems we write and gives us confidence that the code we write will work as expected.
Read more...
June 05, 2018
In a previous post, I began refactoring the carmix-collector project. While its function is fairly simple, it had one class trying to do all the work which, among other things, makes the code terribly hard to test. In this post, I will be doing another refactoring that might take a minute to work out, but we will back up and try something new, if needed.
I start by creating a new playlist processor class, M3UPlaylistProcessor
. Its purpose will be to process the selected playlist file. Inside the new class, I create a new method, process
that will contain the body of the existing processPlaylistPath
method from the GUI class. I think naming the method process
is a bit better than processPlaylistPath
since the class name makes 'Playlist' redundant and the Path
method parameter makes it clear we are processing a Path.
If you try and compile this code (or are using an IDE), you'll notice an error: Cannot resolve method 'processTrackURL'
. You'll probably also notice it doesn't recognize the Status enum either or the Strings used to denote the M3U header and info sections. Seems simple enough to just copy the processTrackURL method and other missing items from the GUI class and keep going.
Read more...
May 29, 2018
Refactoring is an important step in the software development process. It should be practiced early and often to help maintain and improve code quality over time. Of course, we're all guilty of putting this off when deadlines and management expectations force us to choose between just having something working on time or improving upon your existing code while writing new functionality. If we're lucky, we're able to go back and refactor the code before the project is over, but it must be done with great care to ensure that we do not break any of the existing functionality. This is where it would be important to have an existing test suite to verify everything runs as expected before and after refactoring.
Read more...
May 18, 2018
At my (now former) day job, I work on a monolithic java web application. CORS, cross-origin resource sharing, is not something that is a big concern for that application since, by default, a script can make a request to the same server from which it originates. However, modern web applications tend to have incoming connections from a variety of other applications, including mobile applications or a single page application running on a separate server. When working with these applications or with your own applications that need to connect to different servers, CORS is something that developers definitely need to keep in mind.
Read more...
May 10, 2018
The past few posts have been going over some of the improvements that came with updating to Java 7. This article will showcase how to use strings in a switch statement using the carmix-creator project.
Prior to Java 7, you had to use constants of type Byte
, Character
, Short
, Integer
constants (or their primitive equivalents) or enum
s as values for the cases in a switch
statement. Personally, I think this is the way to go, but it isn't always possible and I'm glad they made the change. The majority of my past projects made minimal use of enum
or Integer
/int
constants. They relied heavily on Strings, so this change would help make an endless sea of if statements look a little bit better. Even in my own code, I've only started to embrace enums, but it'll be a while before I get to writing about all that.
Read more...
May 08, 2018
Ripplr is a Twitter-like application that I originally developed as I followed along in the Grails in Action books from Manning Publications. It was originally written using Grails 2 and I updated it to use Grails 3.1.9 a while ago. It's not the prettiest looking site in the world, but it makes use of Bootstrap 3.3, some jQuery, and has basic follow & blocking functionality, so it has a decent foundation with which to start. It's a bit messy all around, but I'm hoping to be able to use it for this blog a bit and clean it up along the way.
There are a few Ripplr-related projects as well. There's an Angular user interface that I've written separately from the main application that I will use from time to time to help test out ideas. There is likely going to be a pure Java version of Ripplr as well. I have a few things I'd like to write about using Java web applications, so I could probably make use of Ripplr here.
Feel free to check it out on GitHub and keep checking back regularly to see its progress.
May 03, 2018
In a previous post, we updated legacy Java 5 code to use try-with-resources. In this post, we will be updating that same legacy code to use the new to Java 7 NIO.2 enhancements.
Java's NIO.2 is a set of new classes and methods that predominantly live in the java.nio
package. It is intended to be a replacement of java.io.File
as the abstraction when dealing with code that reads or writes from a filesystem.
The starting point for using NIO.2 is the Path
. The Path
class contains various methods for interacting with the path and extracting information about it. For the carmix-creator application, we will replace all uses of the File
class and replace them with Path
and other new classes and methods from the java.nio.file
package.
Since carmix-creator copies files, it makes heavy use of several File
-related classes. For starters, the loadPlaylistFile
method can be refactored into a new method loadPlaylistPath
.
Read more...
April 26, 2018
So, Java 7 has been out for a quite a while now, but my corporate career had yet to expose me to anything newer than Java 6 at the time I originally wrote this. Even now, as of April 2018, I have yet to work with Java 8 in an enterprise setting. So, I took it upon myself at the time to learn some of the updates to the Java language in this version before starting to look at the changes in Java 8 (and now Java 9 and 10!).
Java 7 brought several key improvements to the language. Some of these improvements I will be covering include:
- Try-with-resources
- New File IO (NIO.2)
- The ability to use strings in a switch statement
- The ability to handle multiple exceptions in one catch block
For this article, we are highlighting try-with resources and how this change can be applied to pre-Java 7 code. To showcase these improvements, I will be using the carmix-creator. It has several areas that are in dire need of the Java 7 improvements.
Read more...
April 19, 2018
I will be using several projects of various sizes throughout my posts here. The first project I would like to introduce is one I've dubbed the 'carmix-collector'. It's a Java Swing application that was created using Netbeans and I wrote the original version of this code around 2006. The purpose of this application was to quickly copy songs from an MP3 Playlist (*.m3u) file so that they could be burned to a CD for use in my MP3-enabled car stereo. Needless to say, I haven't paid much attention to this code in years.
Read more...
April 11, 2018
For several years now, I have set out intending to start this blog all about Java and the surrounding ecosystem. I keep putting it off for one reason or another, ranging from feeling like I never enough time to fearing that I'm not good enough to keep up something like this.
So, here I am fully intending to give this all a try. Again. I have so many things I’d like to write about, which brings up the question of where I should start writing. I suppose the best place for me to start is the beginning. I have several articles I wrote for myself back when Java 7 came out. Of course, we are now up to Java 9 10, so these articles aren't as relevant as they may have been when I originally wrote them. I think I will keep the originals more or less intact. They deal with migrating code written using Java 5 and making use of new Java 7 constructs. I will take those changes and update the code to make use of Java 8, wherever possible and keep going to 9 and above to help keep up with the times.
This code will also need a lot of work in regards to refactoring, which will all come after moving through the newer versions of Java. I'll just go where the code takes me, I suppose.
I also like working with Groovy, Grails, Scala, and plenty of other languages and frameworks within the Java world and I hope to have some articles around my experiences with them and more.
Here’s to hoping I’ll keep things going this time around!
December 31, 2016
Hi everyone! My name is Joel and I have been a java developer for over 10 years now. Lately, I have become more and more interested with working with newer technologies and languages but find myself working on legacy software without a lot of chances to branch out into the expanded java world. Through this blog, I hope to explore newer technologies and help explain the baffling issues I come across along the way. I will also discuss the java language itself and explore topics that will hopefully be relevant to more than just myself.
I look forward to posting new topics and ideas!
Also, please excuse the mess as I continue configuring the site in a way I think looks best.
Older posts are available in the archive.