Spark-Solace Connector: Writing to Solace in a Distributed Way Using Spark

4 min read · Jul 28, 2020


Figure: Solace running in Docker on the local system

Solace is a message broker that follows the pub-sub (publisher-subscriber) pattern. A producer publishes events/messages to a Solace queue or topic, and a consumer subscribes to a topic or consumes from a queue as the requirement dictates.
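To make the queue-versus-topic distinction above concrete, here is a toy in-memory broker. The class and method names are purely illustrative, not Solace's API: a queue holds each message until exactly one consumer pulls it, while a topic fans a published message out to every subscriber.

```python
# A toy in-memory broker illustrating queue vs. topic semantics.
# Names are illustrative only, not Solace's API.
from collections import defaultdict, deque

class ToyBroker:
    def __init__(self):
        self.queues = defaultdict(deque)            # queue name -> pending messages
        self.topic_subscribers = defaultdict(list)  # topic name -> subscriber callbacks

    # Queue semantics: each message is delivered to exactly one consumer.
    def send_to_queue(self, queue, message):
        self.queues[queue].append(message)

    def consume(self, queue):
        return self.queues[queue].popleft() if self.queues[queue] else None

    # Topic semantics: every current subscriber receives a copy.
    def subscribe(self, topic, callback):
        self.topic_subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self.topic_subscribers[topic]:
            callback(message)
```

A queued message disappears after one `consume`, whereas a published message reaches all subscribers of the topic.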
The problem was to send events to Solace from HDFS (Hadoop Distributed File System) in a distributed fashion, while also adhering to some other business constraints.

There are multiple ways to publish data to and consume data from Solace. One way is via REST services, which is fairly straightforward if we follow Solace's documentation. But if you have millions of rows distributed as blocks across an HDFS cluster (or some other underlying storage, since the idea is generic and applies equally to file systems such as Amazon S3) and you want to send that data to Solace in a distributed way, you can leverage the Solace JMS API to publish to a Solace queue or topic.
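For the REST path mentioned above, Solace brokers expose a REST messaging endpoint where an HTTP POST to a `/QUEUE/<name>` or `/TOPIC/<name>` path publishes the request body as a message. A minimal sketch using only the standard library is below; the host, port (9000 is a common default for the REST service), and queue name are placeholders, so check your broker's actual REST service settings.

```python
# A minimal sketch of publishing to Solace over REST with the standard
# library. Host, port, and queue name are hypothetical placeholders.
import json
import urllib.request

def build_publish_request(host: str, queue: str, payload: dict,
                          port: int = 9000) -> urllib.request.Request:
    """Build a POST request that publishes `payload` as a JSON message
    to a Solace queue via the broker's REST messaging endpoint."""
    url = f"http://{host}:{port}/QUEUE/{queue}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Actually sending it requires a reachable broker:
#   urllib.request.urlopen(build_publish_request("localhost", "my-queue", {"id": 1}))
```

Separating request construction from sending keeps the sketch testable without a running broker; in practice you would also handle authentication and HTTP error responses.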

Design:

The architecture was to open one connection to the Solace broker per Spark executor partition, and to send the rows of each partition either as individual events or as a single array of JSON objects by clubbing all rows…
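The per-partition logic can be sketched as follows. The function names are hypothetical and the connection/send step is stubbed out (in a real job, `send` would be backed by a Solace JMS producer opened once per partition); only the row-clubbing into JSON-array payloads is concrete, and it needs no Spark to run.

```python
# Sketch of what each partition would do: open one connection, club the
# partition's rows into JSON-array payloads, send them, then close.
# The connect/send steps are stubbed; names are hypothetical.
import json
from itertools import islice
from typing import Callable, Iterable, Iterator

def rows_to_json_batches(rows: Iterable[dict], batch_size: int) -> Iterator[str]:
    """Club rows into JSON-array payloads of at most `batch_size` rows each."""
    it = iter(rows)
    while batch := list(islice(it, batch_size)):
        yield json.dumps(batch)

def publish_partition(rows: Iterable[dict],
                      send: Callable[[str], None],
                      batch_size: int = 100) -> None:
    """Per-partition driver: `send` stands in for a real Solace producer
    opened once for this partition and closed afterwards."""
    for payload in rows_to_json_batches(rows, batch_size):
        send(payload)

# With PySpark this would run on the executors, e.g.:
#   df.foreachPartition(lambda rows: publish_partition(
#       (r.asDict() for r in rows), send=my_producer.send))
```

Opening the connection inside `foreachPartition` (rather than on the driver) matters because broker connections are not serializable and must be created on each executor.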


Written by Harjeet Singh

Problem solver; writes on tech, finance, and product. Watch out for my new creation, "THE PM SERIES".
