Resource Handling for a Collection of Resources in Scala, and the Challenges with Java try-with-resources

Afsal Thaj
Published in Level Up Coding
6 min read · Nov 7, 2020


Photo by Tamanna Rumee on Unsplash

There are wonderful libraries in the Scala ecosystem for resource handling. I recommend the following as a quick start.

  1. ZManaged from ZIO (https://github.com/zio/zio)
  2. cats.effect.Resource (https://github.com/typelevel/cats)

These libraries are superior to many of their counterparts in other popular languages, though I find the comparison is mostly made with Java try-with-resources. The simple story of this blog is that the resource management techniques you can leverage with the libraries mentioned above are not just a shiny alternative to Java try-with-resources, because some (well, many) important semantics are simply missing in try-with-resources (detailed further below). Hence, I find these libraries to be something of a necessity when it comes to handling resource-intensive use cases. Together with a fiber-based concurrency model, they enable us to acquire millions of resources in parallel, with well-defined release strategies on interruptions and failures. These techniques have been available in Scala (as far as the JVM is concerned) for 3–4 years. It isn’t feasible to discuss all of it in this blog, so I will focus on a code snippet that gives the gist of what these things are.

NOTE: Talking of concurrency, Project Loom is around on the JVM; however, it doesn’t add anything extra, and it is not much of a surprise to those who have already used a fiber-based concurrency model on the JVM through ZIO/Cats for years.

Pre-requisites to further reading

1. Knowledge of Scala syntax
2. Knowledge of cats.effect.Resource

This blog is mostly code, and will probably make the most sense to those who have already read up on cats.effect.Resource.

Below is a code snippet that summarises the whole blog, for those who only want the summary.

For those who would like to know more about the intuition and a detailed explanation, keep reading from “A quick example.”

import cats.effect.{IO, Resource}
import cats.syntax.all._

val listOfResources: List[Resource[IO, Connection]] =
  List.fill(100)(Resource.make(acquire)(release))

val resource: Resource[IO, List[Connection]] =
  listOfResources.sequence

// Use all connections or not; all connections will surely be closed.
val dbResult: IO[DbResult] =
  resource.use(connections => ???)

// Or use them one by one, making sure connections are closed as and when they are used.
val dbResults: IO[List[DbResult]] =
  listOfResources.traverse(resource => resource.use(con => getDbResult(con)))
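
To make the difference between the two styles concrete, here is a minimal, self-contained sketch. It assumes cats-effect 3 is on the classpath; Connection, the connection helper and the Ref-based log are hypothetical stand-ins invented for illustration, not a real database API.

```scala
import cats.effect.{IO, Ref, Resource}
import cats.effect.unsafe.implicits.global
import cats.syntax.all._

object SequenceVsTraverse {
  // Hypothetical stand-in for a real database connection.
  final case class Connection(id: Int)

  // A resource that records its acquisition and release in a shared log.
  def connection(id: Int, log: Ref[IO, Vector[String]]): Resource[IO, Connection] =
    Resource.make(log.update(_ :+ s"open $id").as(Connection(id)))(c =>
      log.update(_ :+ s"close ${c.id}"))

  // Acquire everything first, use the whole list, then release everything.
  val together: IO[Vector[String]] =
    for {
      log <- Ref.of[IO, Vector[String]](Vector.empty)
      all  = List(1, 2).map(id => connection(id, log)).sequence
      _   <- all.use(cs => log.update(_ :+ s"use ${cs.map(_.id).mkString(",")}"))
      out <- log.get
    } yield out

  // Acquire, use and release each connection before touching the next one.
  val oneByOne: IO[Vector[String]] =
    for {
      log <- Ref.of[IO, Vector[String]](Vector.empty)
      _   <- List(1, 2).traverse(id =>
               connection(id, log).use(c => log.update(_ :+ s"use ${c.id}")))
      out <- log.get
    } yield out

  def main(args: Array[String]): Unit = {
    println(together.unsafeRunSync())
    println(oneByOne.unsafeRunSync())
  }
}
```

In this sketch, together logs both opens before the single use and then the closes in reverse order of acquisition, while oneByOne opens, uses and closes each connection before moving on to the next.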

A quick example.

Say you are downloading files from an Amazon S3 directory and doing a bit of processing after downloading them to your local machine or into your application’s memory: maybe you encrypt the contents into a separate temporary file, and then send these encrypted files (along with their metadata, such as size and checksum) to another downstream system.

There can be multiple files or a single file, and users may choose to send them one by one, together, or maybe concurrently. On top of this, you have to make sure that you are not sending duplicate files downstream, which means you have state to manage as well.

A quick starting point is here. It’s just a minor, yet useful, decoration on top of cats.

Let’s create a downloadAllFiles function that can download files from S3. Not a great name, but names really don’t matter here. Disregard the details in the code; what mainly matters is the signature of downloadAllFiles. Can you guess why it returns the type Resources?

In summary, downloadAllFiles doesn’t do anything: it just returns a List[cats.effect.Resource[IO, TemporaryFile]] (and not a Resource[IO, List[TemporaryFile]] yet), which is then wrapped in the Resources data type. This restricts the user in quite a lot of ways. They have to call use (or executeAll) to do anything with Resources, which forces them at compile time to think about resource acquisition and release.

In simple terms, a truly guaranteed resource release. Let’s see the implementation of a TemporaryFile. In short, it’s just a cats.effect.Resource of File.
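
As a rough illustration of what “a cats.effect.Resource of File” can look like, here is a minimal sketch assuming cats-effect 3. The temporaryFile helper and the "download-" prefix are hypothetical names chosen for this example; this is not the TemporaryFile implementation referred to above.

```scala
import cats.effect.{IO, Resource}
import cats.effect.unsafe.implicits.global
import java.io.File

object TemporaryFileSketch {
  // Hypothetical helper: create a temp file on acquisition, delete it on release.
  def temporaryFile(prefix: String): Resource[IO, File] =
    Resource.make(IO.blocking(File.createTempFile(prefix, ".tmp")))(file =>
      IO.blocking(file.delete()).void)

  // The file is only guaranteed to exist inside `use`.
  // Returns whether it existed during use, plus the file handle for inspection.
  val demo: IO[(Boolean, File)] =
    temporaryFile("download-").use(file =>
      IO.blocking(file.exists()).map(existed => (existed, file)))

  def main(args: Array[String]): Unit = {
    val (existed, file) = demo.unsafeRunSync()
    println(s"existed during use: $existed, exists after use: ${file.exists()}")
  }
}
```

The point of the sketch is that the caller never deletes the file explicitly: the clean-up is attached to the value itself and runs when use finishes, whether it succeeds or fails.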

That was a mouthful of code, but most of it will become a black box in application code, and using it will be a one-liner for a variety of use cases.

I will run through a few examples that uncover the use cases clearly.

Example 1 : Most obvious usage — Send files one at a time, yet keep a state

downloadAllFiles(...).use((_, a) => sendFile(a))

The above usage makes sure of the following:

  • All files will be downloaded individually, and the contents of each will be copied to a local, renamed file.
  • Any failure during this acquisition is handled resiliently: it is caught and reported to the user, and any resources opened in the meantime are closed.
  • Each of these files is used one by one by passing it to sendFile (to some destination), and all the resources associated with it are guaranteed to be cleaned up individually. In a way, this implies that a resource won’t be allocated until it is time to execute the corresponding sendFile.
  • This prevents multiple files (possibly sizeable ones under an S3 path) from being downloaded before they need to be transferred, which avoids out-of-memory crashes to a great extent. Also, a failure in one of the file transfers results in the corresponding resource being deleted immediately.
  • You can also keep state across executions, say to avoid sending duplicate files, or to accumulate the size of the files. Since these files never exist together at any point, we need to keep state. For example, the code given below makes sure we are not sending duplicate files without collecting resources into memory, allowing instant clean-ups.
downloadAllFiles(...)
  .use(
    Resources.trackDuplicate(sendFile)(_.toString)(
      name => s"Duplicate file tracked. ${name}"
    )
  )
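
The idea of keeping state while resources are used one at a time can be sketched with plain cats-effect 3 and a Ref. This is only an approximation under stated assumptions: Downloaded, download and sendOnce are hypothetical names, the "send" is simulated, and none of this is the Resources.trackDuplicate implementation used above.

```scala
import cats.effect.{IO, Ref, Resource}
import cats.effect.unsafe.implicits.global
import cats.syntax.all._

object TrackDuplicates {
  // Hypothetical stand-in for a downloaded temporary file.
  final case class Downloaded(name: String)

  // Acquisition and release are no-ops here; a real version would download and delete.
  def download(name: String): Resource[IO, Downloaded] =
    Resource.make(IO(Downloaded(name)))(_ => IO.unit)

  // Simulated send: Right for a new name, Left with a message for a name already seen.
  def sendOnce(seen: Ref[IO, Set[String]])(file: Downloaded): IO[Either[String, String]] =
    seen.modify[Either[String, String]] { names =>
      if (names(file.name)) (names, Left(s"Duplicate file tracked. ${file.name}"))
      else (names + file.name, Right(s"sent ${file.name}"))
    }

  // Each file is acquired, used and released before the next; only names are kept in memory.
  val run: IO[List[Either[String, String]]] =
    for {
      seen <- Ref.of[IO, Set[String]](Set.empty)
      out  <- List("a.csv", "b.csv", "a.csv").traverse(name =>
                download(name).use(file => sendOnce(seen)(file)))
    } yield out

  def main(args: Array[String]): Unit =
    run.unsafeRunSync().foreach(println)
}
```

Only the set of names outlives each resource, so duplicates can be detected without ever holding more than one file at a time.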

Example 2 : executeAll —Gather all resources to fetch a list of files, send them together and release resources together.

downloadAllFiles(...)
.executeAll
.use(allFiles => sendAllFiles(allFiles))
  • In this case, resources are accumulated (and no further use or clean-up is initiated until all of them are acquired).
  • This allows you to use the collection of downloaded, renamed files together at a later point in time. sendAllFiles is an example.
  • It guarantees the release of all resources associated with the collection after sendAllFiles succeeds or fails.
  • A practical example: for a JDBC transfer using Spark, we may need to collate all the file resources and close them only after a collective transfer to the SQL database.

Example 3: Gather resources together, but send files one by one (a version of Example 2)

You can also acquire resources together, but use each resource individually (maybe in some complex process), and guarantee the release of all resources at all times.

downloadAllFiles(...)
.executeAll
.use(allFiles => allFiles.traverse(a => sendFile(a)))
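
The release guarantee on failure can be checked with a small sketch, again assuming cats-effect 3. Here executeAll is approximated by sequence on a plain list of resources, and file and sendFile are hypothetical stand-ins in which the send for "b" always fails.

```scala
import cats.effect.{IO, Ref, Resource}
import cats.effect.unsafe.implicits.global
import cats.syntax.all._

object ReleaseOnFailure {
  // Hypothetical resource that records its name when it is released.
  def file(name: String, released: Ref[IO, Set[String]]): Resource[IO, String] =
    Resource.make(IO.pure(name))(n => released.update(_ + n))

  // Hypothetical send that fails for the file named "b".
  def sendFile(name: String): IO[Unit] =
    if (name == "b") IO.raiseError(new RuntimeException(s"cannot send $name"))
    else IO.unit

  // Returns (didTheUseFail, namesOfReleasedResources).
  val run: IO[(Boolean, Set[String])] =
    for {
      released <- Ref.of[IO, Set[String]](Set.empty)
      all       = List("a", "b", "c").map(n => file(n, released)).sequence
      result   <- all.use(files => files.traverse_(f => sendFile(f))).attempt
      names    <- released.get
    } yield (result.isLeft, names)

  def main(args: Array[String]): Unit =
    println(run.unsafeRunSync())
}
```

Even though sending "b" fails and "c" is never sent, all three resources are released once use completes.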

The whole idea is roughly equivalent to having a collection of composable, descriptive versions of Java’s try-with-resources. But such a thing doesn’t exist.

Challenges with try-with-resources in Java

Porting the above modular solution of resource handling back to Java try-with-resources is reasonably challenging; however, it’s a good exercise to try:

  • try-with-resources has an edge case (apparently, I heard it’s being fixed) where try(var f = foo()) may do something with resources that can fail, and it’s never wrapped in a try-catch internally. That is, your first resource acquisition can fail. In this case, you could probably wrap the try-with-resources in another try-catch (I don’t know).
  • The next challenge is how to keep try-with-resources from conflating the acquisition and release of resources with their usage. You can’t separate them, because the “use” of the resources has to go along with it. In other words, how do we safely acquire resources in one place, yet make sure they are closed after being used elsewhere?
  • Also try to acquire resources in parallel and use them individually, yet make sure all of them are closed once all of them have been used. Meanwhile, if any one usage fails, make sure you release all the other resources.
  • If you manage to jump through all of the above hurdles, then read the code and see whether you can statically prove that every place a resource is used is bound to a resource release.
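
For comparison, here is how the parallel challenge might look on the Scala side. This is a minimal sketch assuming cats-effect 3, with hypothetical file and sendFile helpers; it uses parTraverse over IO, so each fiber acquires, uses and releases its own resource, and a failure in one cancels the others.

```scala
import cats.effect.{IO, Ref, Resource}
import cats.effect.unsafe.implicits.global
import cats.syntax.all._
import scala.concurrent.duration._

object ParallelUse {
  // Hypothetical resource that records both its acquisition and its release.
  def file(
      name: String,
      acquired: Ref[IO, Set[String]],
      released: Ref[IO, Set[String]]
  ): Resource[IO, String] =
    Resource.make(acquired.update(_ + name).as(name))(n => released.update(_ + n))

  // Hypothetical send: "b" fails immediately, the others take a little while.
  def sendFile(name: String): IO[Unit] =
    if (name == "b") IO.raiseError(new RuntimeException(s"cannot send $name"))
    else IO.sleep(50.millis)

  // Returns (didAnySendFail, acquiredNames, releasedNames).
  val run: IO[(Boolean, Set[String], Set[String])] =
    for {
      acquired <- Ref.of[IO, Set[String]](Set.empty)
      released <- Ref.of[IO, Set[String]](Set.empty)
      result   <- List("a", "b", "c")
                    .parTraverse(n => file(n, acquired, released).use(f => sendFile(f)))
                    .attempt
      a        <- acquired.get
      r        <- released.get
    } yield (result.isLeft, a, r)

  def main(args: Array[String]): Unit =
    println(run.unsafeRunSync())
}
```

Which files get acquired before the failure is timing-dependent, so the sketch only checks that whatever was acquired has also been released by the time the error surfaces.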

Once all of this is done, post your results in the comments section for everyone’s benefit :)

I hope you got something out of my writing. I will improve it further with more context and reasoning if you think it’s too abstract.

I thank John De Goes for sharing his wonderful thoughts on Java try-with-resources which inspired me to write more about challenges of try-with-resources. I recommend going through John’s courses through: https://patreon.com/jdegoes

Cheers !


A software engineer, traveller, fitness trainer, hiker, skydiver and I write at times. https://twitter.com/afsalt2