Running the Kinesis Kafka Connector Inside a Container

June 30, 2020

In my previous article, we walked through running a distributed Kafka Connect setup via Docker. In this article, I will be turning my attention to the Kinesis Kafka Connector and going through the steps to get it working in distributed mode and connecting to external services, e.g. an external Kafka cluster and The Kinesis Firehose service.

The overall intention of this is to go through how to connect a containerized application to external, non-containerized applications. We could just as easily replace the Kinesis Kafka Connector with something like the Elasticsearch Connector.

Read more...

Running a Kafka Connector Inside a Container (Docker)

June 08, 2020

Everywhere you look these days in the software development world, you'll find containers in use in a variety of situations and by a vast number of developers and companies. This article from Docker does a good job of explaining containers. However, containers go far beyond Docker, including Kubernetes, Podman, Apache Mesos, and Red Hat OpenShift Container Platform among others.

I'm planning on writing a series of articles that go through various stages of deploying Kafka Connect in a containerized environment. For this article, I plan on getting to the point of deploying a multi-node distributed connector using docker. I will use one source connector in standalone mode that will be used to populate a Kafka topic with data and I will deploy a sink connector in distributed mode to pull the data back out.

Later articles will explore deploying other sink connectors in distributed mode, including the Kafka-Kinesis Connector, via containers. For this article, I will be using Docker via Docker Compose. I am hoping to look more into Podman and attempt deployment via Kubernetes in future articles. I am by no means an expert in any container technology, but I can mostly get around using containers in Docker. So, this is a learning experience on multiple fronts for me.

Read more...

Using AssumeRole with the AWS Java SDK

November 27, 2019

When working in AWS, AssumeRole allows you to have access to resources to which you might not normally have access. Some use cases for using AssumeRole is for cross-account access, or in my case, developing locally. Using AssumeRole lets me grant my local code permissions as per those provided by the role to which you're assuming, such as DynamoDB, RedShift, or S3 access.

When I first found myself needing to this, I found several tutorials in the AWS documentation that showcased several types of credential usages, such as federated credentials and temporary credentials. but I never found a similar tutorial for using AssumeRole. I assume the documentation exists, I just didn't find it in the time I had to work on my code. So, here's how I went about implementing AssumeRole in Java using version 1.11.622.

Read more...

Parsing YAML Using Kotlin Objects

November 19, 2019

When I set out to customize the Kafka-Kinesis Connector to read in a YAML configuration for use by the connector, I decided to go with Kotlin data classes to hold that configuration. The initial versions of these classes aren’t as good as I think they could be, though. For example, all of these data classes make use of vars rather than vals. I’d much rather stick to vals whenever possible due to the fact that they're read-only and cannot be changed once set.

The initial data classes also all use optional (nullable) types for every field. I’d prefer using non-optionals when I can and when I know that certain values should always be non-null, such as the topic or destination stream names. Even for the 'optional' items, like filters, I'd still prefer to return an empty list rather than null. These data classes were all created this way because it was the quickest way I could get it all to work together and, at the time, all I wanted was something that worked. So this is me coming back to see what I can do to make these classes better, if anything.

Read more...

Adding SSL Encryption Configuration to Kafka Connectors

October 27, 2019

One thing you should always strive to do is to enable encryption, wherever possible, even if your systems are locked away behind layers and layers of firewalls and other security protocols. Even if your systems are not public facing and don't have a public IP address, you should at the very least encrypt your communications. For this article, we will explore how we enable a Kafka connector to communicate with an SSL-encrypted Kafka cluster.

Read more...

A Tale of Two Git Repos

October 20, 2019

When I started out writing the series of articles regarding Kafka Connect and the Kafka Kinesis Connector, I originally forked the one from AWSLabs and I used that fork for all my modifications and originally intended to use it for all of my articles. However, as I progressed through writing them, I realized I might want to submit some of those things for a pull request to the original repo. I didn't want to add the mess of the YAML and multiple destinations to their code. I didn't want to taint their codebase. So, I made a complete copy of that fork and put it into a new repository so that I could continue to do my articles and experimentations with multiple destinations, etc. I'll eventually yank all of my changes out of my fork and only submit the things I hope others would find useful, such as the config validations.

So if you're ever looking through my github repos and see the two Kafka Kinesis Connector repositories, there's the logic behind them. I apologize if it has caused any confusion.

Creating a Builder for the Kafka Connect ConfigKey Class

October 12, 2019

One of the issues I have with creating a custom ConfigDef for the Kafka Kinesis Connector is having to deal with multiple, very long, method signatures for ConfigDef.define. ConfigDef provides an astonishing 16 define methods as of version 0.11.0.2 (the version that's used by the project), with one that takes in a ConfigKey that actually performs the work. All of the other methods eventually call this method, passing in a new ConfigKey, which means the constructor for this class is also a monster.

Read more...

Adding Configuration Recommendations to the Kafka Kinesis Connector

October 08, 2019

So, we've strengthened up our configuration validation logic for the Kafka Kinesis Connector but there's still more we can do to help those that want to use this plugin. Of course, the principles used in these articles should generally apply to most any Kafka Connector you create. The next step is to help recommend configuration values to our users. For example, we can validate that a user has given us a valid AWS region, e.g. us-east-1, us-west-1, etc. but the user may not know all the valid regions available. For this purpose, we can provide a recommender to help recommend values to the user by implementing the ConfigDef.Recommender interface.

Read more...

Adding Configuration Validation to the Kafka Kinesis Connector

October 02, 2019

As it currently stands, you can easily feed invalid property values to the Kafka Kinesis Firehose Connector, such as a batch size of 1000, when the maximum batch size allowed by Firehose is only 500. Thankfully, Kafka Connect provides a mechanism to help with validating properties. The REST API provides a validate endpoint at /connector-plugins/<CONNECTOR_CLASS_NAME>/config/validate.

Read more...

My Kafka Connect Woes When Updating the Kafka Kinesis Plugin

September 28, 2019

We have several 3-node Kafka connect clusters that each process different topics. If you’ve read my first article regarding my project's customized version of the AWS Labs Kafka-Kinesis Connector, you’ve already read about the corner we backed ourselves into with our first deployments of the modified code. We’ve been working, when possible, to fix those mistakes and to make updates to the clusters and their configurations easy. Well, at least easier than the abomination with which we started.

During one of the updates to the code, updates that added additional properties and an additional configuration file, we ran into an issue with one of the clusters. We were attempting a rolling update of the Kafka-Kinesis Connector plugin and while it went smoothly for the first two clusters, it was a huge pain to get this particular cluster to update appropriately.

Read more...

Adding Filtering to the Kafka Kinesis Connector

September 26, 2019

In a previous post, we made modifications to the Kafka-Kinesis Connector that allowed us to use multiple topics and multiple destinations with a single connector task. This time, we’ll go a bit further and add more functionality to the connector. What if we later ended up with a requirement that some of the data going through a topic had to be filtered out into additional destination firehoses?

Read more...

Customizing the Kafka Kinesis Connector

September 22, 2019

Prior to the work that eventually led me to write this article, and the other Kafka-related ones that will follow, I had very little exposure to Kafka and its ecosystem. I knew, more-or-less, what Kafka was and could do the basics to get a broker started and then run producers and consumers from the command line. That was the extent of my working with Kafka until I inherited the maintaining and deployment of Kafka Connect clusters at work.

These clusters all forward data to AWS Kinesis Firehoses, which then stream the data to Redshift, Splunk, Elasticsearch for Kibana dashbboards, or another Big Data store. The connector used by all of these clusters is a variation of this one from AWSLabs.

I say variation due to the fact that we had to modify it in order for it to suit our requirements. Rather than run one connector per topic/firehose combination, we needed it to take in multiple topics and then deliver to the related Kinesis Data Firehose. For each Kafka topic, there is at least one Kinesis Data Firehose. Due to time constraints, this ended up being a very painful process, and the evolution of that code is what will be explored here, at least as far as it can be.

Read more...

Migrating joelforjava.com from WordPress to JBake

September 15, 2019

When I started this site back in 2016, I went with WordPress because it was the quickest solution I could find to get things off the ground. However, as time has progressed, it has become clear that it's a bit overkill for my needs. I don't plan on ever having users sign up for accounts, for example. It's just me and the stuff I write. Could that change? Possibly, but not likely. Meanwhile, static site generators, such as Gatsby, Sphinx, Jekyll, VuePress, and a trove of others have been gaining in popularity. That's when it hit me--all I really need is a basic static-type site.

While I really wanted to go with something like VuePress or Gatsby so that I could sharpen my now archaic JavaScript skills, I decided to go with JBake. JBake has features that were more familiar to me, including Freemarker and Groovy based templates and even has a Gradle plugin so that I could create a build script to help maintain the new version of the blog. Other offerings have similar features, but I've really missed working with Groovy and Gradle. These days, for better or worse, I'm mostly working with Python.

Read more...

The Burnout is Real

April 08, 2019

That's about the plainest way I can say it. If you feel yourself getting burned out, stop. Just stop. Between my job and my attempts at side projects, I ended up severely burned out. Honestly, I still don't feel like I've recovered, but I can't just stop working on the things I enjoy while my job has tried to move me away from Java.

So, I may still be gone a while longer, but I'm going to try and get back into the swing of things and hopefully I'll be posting again soon. Starting small and just moving from there.

Always take care of yourself.

Updating Legacy Code: Lambdas I

September 28, 2018

Now that we've gone through and updated carmix-creator with all of the possible Java 7 updates, we can finally start looking at Java 8 and how we can make use of some of the constructs and other updates that Java 8 has to offer. For starters, we will see where we can make use of the biggest advancement to come to the language with the Java 8 release--lambda expressions.

What is a Lambda Expression

A lambda expression is a representation of an anonymous function comprised of a set of parameters, the lambda operator (->) and a function body, which is optionally surrounded by curly braces. Prior to Java 8, you might have used an anonymous class to handle this work, but lambda expressions help make your code less verbose and they can help make your intent more clear.

Read more...

Grails: Nested Validation in Command Objects

September 25, 2018

If you've ever found yourself in need of validating nested Grails command objects, you've likely done something similar to the below snippet:

static constraints = {
	profile validator: { val, obj -> val.validate() }
}

This works well if you want to prevent someone from submitting a form without filling out the required profile data, however, the default error message leaves a lot to be desired.

Read more...

Refactoring: Replace Parameter With Explicit Methods

September 21, 2018

As I continue to refactor and refine my carmix-collector project, I've come across a method that I really don't like. It looks too busy and depends on two separate parameters to determine what work should be done inside the method. Here's the method as it currently exists:

private static boolean verifyFile(Path aPath, String indFileDir, String indReadWrite) throws IOException {
    if (!Files.exists(aPath)) {
        throw new IOException("File Verification: " + aPath.getFileName() + " does not exist");
    }

    switch (indFileDir) {
        case FILE_TYPE:
            if (!Files.isRegularFile(aPath)) {
                throw new IOException("File Verification: " + aPath.getFileName() + " does not exist");
            }
            break;
        case DIR_TYPE:
            if (!Files.isDirectory(aPath)) {
                throw new IOException("Directory Verification: " + aPath.getFileName() + " does not exist");
            }
            break;
    }

    switch (indReadWrite) {
        case READ_FILE:
            if (!Files.isReadable(aPath)) {
                throw new IOException("File Verification: Cannot read file " + aPath.getFileName());
            }
            break;
        case WRITE_FILE:
            if (!Files.isWritable(aPath)) {
                throw new IOException("File Verification: Cannot write to file " + aPath.getFileName());
            }
            break;
        }
    return true;
}

What we should do is create a new method for each possible parameter value. We should use the Replace Parameter With Explicit Methods refactoring.

Read more...

Updating from Grails Test Mixin Framework to Grails Testing Support Framework

August 17, 2018

In a previous post, I upgraded my Ripplr application, which was using Grails 3.1.9 to version 3.3.6. One of the changes that have come with Grails 3.3 is the new Grails Testing Support framework, which makes heavy use of Groovy traits as opposed to the previous Test Mixin framework, which relied heavily on annotations and AST transformations.

Initially, I was not looking forward to updating test frameworks. However, during my first few refactorings, it has proven to be mostly worry free. For the most part, you're replacing Annotations with trait implementations. You may have to make some other small tweaks, but so far they have been quick to implement.

Read more...

Testing: Using JUnit Rule(s) to Reduce the Usage of Mocking Frameworks

August 10, 2018

In a previous post, I wrote a unit test using JUnit 4.12 that unfortunately made use of what I believed to be unneeded uses of Mockito and PowerMockito. These tests were written with an earlier Junit 4 mindset, a mindset that was unaware of JUnit Rules. Now that I've had some time to look around some, I'm going to rewrite the tests making use of the TemporaryFolder Rule.

The TemporaryFolder rule allows us to create folders and files that are deleted after a test is completed. I realize this is something that I could have likely done without the Rule and even prior to my mocking, but using the Rule makes it far easier to work with and performs the cleanup seamlessly.

All of the changes made are very similar and involve setting up a temporary file and writing to the file and then using it for the current test.

Read more...

Updating from Grails 3.1.9 to Grails 3.3.6

July 16, 2018

I had intended to write an article on upgrading to Grails 3.3.5, but I ran into an issue that prevented me from completing the upgrade. Luckily, this issue was fixed in the 3.3.6 release.

I will walk through the steps I performed to upgrade my Grails 3.1.9 application, Ripplr to version 3.3.6. These steps will likely work for any 3.1.x to 3.3.x upgrade, but I have only successfully tested upgrading with this specific combination.

Here are the steps I performed:

  1. Update build.gradle
    • Update GORM Version
    • Update to Hibernate 5
    • Add Grails test mixins dependency for tests that are now using a legacy framework
    • Add new dependencies
    • Update plugin versions, if at all possible\
  2. Update application.yml and application.groovy
  3. Update resources.groovy
  4. Move Bootstrap.groovy and UrlMappings.groovy out of the default package
  5. Update Gradle Wrapper
  6. Add Grails wrapper
  7. Update Logging Configuration
  8. Fix broken unit tests (or Ignore them)
  9. Fix broken integration tests (or Ignore them)
  10. Verify CORS still works
  11. Fix Hibernate startup errors that occurred after updates

Read more...

Testing: Unit Testing with Mockito and Powermock

July 09, 2018

Testing your code is an important step in the software development lifecycle and should be done as early as possible. Testing, in general, helps us catch mistakes and bugs in our code. I once worked on a project that had very little tests in place and the majority of those tests failed because your database is different from the test database. But, it also had some very important tests that would check Spring configurations and ensure everything was wired up correctly. These integration tests helped catch a lot of wiring issues quickly and helped us be more aware of how we updated the configurations as we updated the code. Tests, no matter if we're talking about unit, integration, functional, stress, or any other form of testing, help us find problems with the systems we write and gives us confidence that the code we write will work as expected.

Read more...

Refactoring: Extract Class (Part 2)

June 05, 2018

In a previous post, I began refactoring the carmix-collector project. While its function is fairly simple, it had one class trying to do all the work which, among other things, makes the code terribly hard to test. In this post, I will be doing another refactoring that might take a minute to work out, but we will back up and try something new, if needed.

I start by creating a new playlist processor class, M3UPlaylistProcessor. Its purpose will be to process the selected playlist file. Inside the new class, I create a new method, process that will contain the body of the existing processPlaylistPath method from the GUI class. I think naming the method process is a bit better than processPlaylistPath since the class name makes 'Playlist' redundant and the Path method parameter makes it clear we are processing a Path.

If you try and compile this code (or are using an IDE), you'll notice an error: Cannot resolve method 'processTrackURL'. You'll probably also notice it doesn't recognize the Status enum either or the Strings used to denote the M3U header and info sections. Seems simple enough to just copy the processTrackURL method and other missing items from the GUI class and keep going.

Read more...

Refactoring: Extract Class (Part 1)

May 29, 2018

Refactoring is an important step in the software development process. It should be practiced early and often to help maintain and improve code quality over time. Of course, we're all guilty of putting this off when deadlines and management expectations force us to choose between just having something working on time or improving upon your existing code while writing new functionality. If we're lucky, we're able to go back and refactor the code before the project is over, but it must be done with great care to ensure that we do not break any of the existing functionality. This is where it would be important to have an existing test suite to verify everything runs as expected before and after refactoring.

Read more...

CORS for Grails 3.1.9

May 18, 2018

At my (now former) day job, I work on a monolithic java web application. CORS, cross-origin resource sharing, is not something that is a big concern for that application since, by default, a script can make a request to the same server from which it originates. However, modern web applications tend to have incoming connections from a variety of other applications, including mobile applications or a single page application running on a separate server. When working with these applications or with your own applications that need to connect to different servers, CORS is something that developers definitely need to keep in mind.

Read more...

Updating Legacy Code, Part III: Switch with Strings

May 10, 2018

The past few posts have been going over some of the improvements that came with updating to Java 7. This article will showcase how to use strings in a switch statement using the carmix-creator project.

Prior to Java 7, you had to use constants of type Byte, Character, Short, Integer constants (or their primitive equivalents) or enums as values for the cases in a switch statement. Personally, I think this is the way to go, but it isn't always possible and I'm glad they made the change. The majority of my past projects made minimal use of enum or Integer/int constants. They relied heavily on Strings, so this change would help make an endless sea of if statements look a little bit better. Even in my own code, I've only started to embrace enums, but it'll be a while before I get to writing about all that.

Read more...

Introdicing the Ripplr Projects

May 08, 2018

Ripplr is a Twitter-like application that I originally developed as I followed along in the Grails in Action books from Manning Publications. It was originally written using Grails 2 and I updated it to use Grails 3.1.9 a while ago. It's not the prettiest looking site in the world, but it makes use of Bootstrap 3.3, some jQuery, and has basic follow & blocking functionality, so it has a decent foundation with which to start. It's a bit messy all around, but I'm hoping to be able to use it for this blog a bit and clean it up along the way.

There are a few Ripplr-related projects as well. There's an Angular user interface that I've written separately from the main application that I will use from time to time to help test out ideas. There is likely going to be a pure Java version of Ripplr as well. I have a few things I'd like to write about using Java web applications, so I could probably make use of Ripplr here.

Feel free to check it out on GitHub and keep checking back regularly to see its progress.

Updating Legacy Code, Part II: NIO.2

May 03, 2018

In a previous post, we updated legacy Java 5 code to use try-with-resources. In this post, we will be updating that same legacy code to use the new to Java 7 NIO.2 enhancements.

Java's NIO.2 is a set of new classes and methods that predominantly live in the java.nio package. It is intended to be a replacement of java.io.File as the abstraction when dealing with code that reads or writes from a filesystem.

The starting point for using NIO.2 is the Path. The Path class contains various methods for interacting with the path and extracting information about it. For the carmix-creator application, we will replace all uses of the File class and replace them with Path and other new classes and methods from the java.nio.file package.

Since carmix-creator copies files, it makes heavy use of several File-related classes. For starters, the loadPlaylistFile method can be refactored into a new method loadPlaylistPath.

Read more...

Updating Legacy Code, Part I: Try with Resources

April 26, 2018

So, Java 7 has been out for a quite a while now, but my corporate career had yet to expose me to anything newer than Java 6 at the time I originally wrote this. Even now, as of April 2018, I have yet to work with Java 8 in an enterprise setting. So, I took it upon myself at the time to learn some of the updates to the Java language in this version before starting to look at the changes in Java 8 (and now Java 9 and 10!).

Java 7 brought several key improvements to the language. Some of these improvements I will be covering include:

  • Try-with-resources
  • New File IO (NIO.2)
  • The ability to use strings in a switch statement
  • The ability to handle multiple exceptions in one catch block

For this article, we are highlighting try-with resources and how this change can be applied to pre-Java 7 code. To showcase these improvements, I will be using the carmix-creator. It has several areas that are in dire need of the Java 7 improvements.

Read more...

Introducing the 'carmix-collector' Project

April 19, 2018

I will be using several projects of various sizes throughout my posts here. The first project I would like to introduce is one I've dubbed the 'carmix-collector'. It's a Java Swing application that was created using Netbeans and I wrote the original version of this code around 2006. The purpose of this application was to quickly copy songs from an MP3 Playlist (*.m3u) file so that they could be burned to a CD for use in my MP3-enabled car stereo. Needless to say, I haven't paid much attention to this code in years.

Read more...

Giving This Another Try

April 11, 2018

For several years now, I have set out intending to start this blog all about Java and the surrounding ecosystem. I keep putting it off for one reason or another, ranging from feeling like I never enough time to fearing that I'm not good enough to keep up something like this.

So, here I am fully intending to give this all a try. Again. I have so many things I’d like to write about, which brings up the question of where I should start writing. I suppose the best place for me to start is the beginning. I have several articles I wrote for myself back when Java 7 came out. Of course, we are now up to Java 9 10, so these articles aren't as relevant as they may have been when I originally wrote them. I think I will keep the originals more or less intact. They deal with migrating code written using Java 5 and making use of new Java 7 constructs. I will take those changes and update the code to make use of Java 8, wherever possible and keep going to 9 and above to help keep up with the times.

This code will also need a lot of work in regards to refactoring, which will all come after moving through the newer versions of Java. I'll just go where the code takes me, I suppose.

I also like working with Groovy, Grails, Scala, and plenty of other languages and frameworks within the Java world and I hope to have some articles around my experiences with them and more.

Here’s to hoping I’ll keep things going this time around!

Hello World!

December 31, 2016

Hi everyone! My name is Joel and I have been a java developer for over 10 years now. Lately, I have become more and more interested with working with newer technologies and languages but find myself working on legacy software without a lot of chances to branch out into the expanded java world. Through this blog, I hope to explore newer technologies and help explain the baffling issues I come across along the way. I will also discuss the java language itself and explore topics that will hopefully be relevant to more than just myself.

I look forward to posting new topics and ideas!

Also, please excuse the mess as I continue configuring the site in a way I think looks best.


Older posts are available in the archive.