In part 1 of this series, we covered the definition and types of streaming data.  That blog can be found here.  Once you have identified your business need to utilize streaming data in your Azure analytics solution, it is time to determine what options and services are available to connect to and ingest this data. 

Within the Azure streaming analytics platform, the following ingestion options are available: IoT Hub, Event Hubs, and Service Bus.  In part 2 of this series, we will ingest streaming data using Event Hubs.  We will use Python to generate some telemetry data and push it to the Event Hub.

 

First, we need to create the event hub resource in our Azure Resource Group.

 

 

 

Create the Event Hub namespace.  The namespace is a container in which you can create one or many event hubs.  I chose the Basic pricing tier for demo purposes.  Throughput units control Event Hub traffic: one throughput unit allows 1 MB per second of ingress and 2 MB per second of egress.  Since this is only a proof of concept, I chose one throughput unit.  Evaluate this setting against your data size and ingestion frequency to prevent throttling.
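
To make the throughput arithmetic concrete, here is a rough sizing sketch.  It assumes the per-unit ingress limits of 1 MB per second or 1,000 events per second, whichever is reached first, and it treats 1 MB as 1,000,000 bytes.  The function name and the sample workloads are hypothetical, so treat the result as a starting estimate rather than a guarantee against throttling.

```python
import math

# Assumed per-throughput-unit ingress limits (whichever is reached first).
INGRESS_BYTES_PER_TU = 1_000_000   # 1 MB per second, taken here as 10^6 bytes
INGRESS_EVENTS_PER_TU = 1_000      # 1,000 events per second


def throughput_units_needed(events_per_second: float, avg_event_bytes: float) -> int:
    """Estimate how many throughput units an ingress workload needs."""
    by_bytes = events_per_second * avg_event_bytes / INGRESS_BYTES_PER_TU
    by_events = events_per_second / INGRESS_EVENTS_PER_TU
    # The tighter of the two limits decides; a namespace has at least one unit.
    return max(1, math.ceil(max(by_bytes, by_events)))


# Hypothetical workload: 100 small weather readings per second, about 200 bytes each.
print(throughput_units_needed(events_per_second=100, avg_event_bytes=200))
```

A small demo like the one in this post fits comfortably inside a single throughput unit; the estimate only starts to matter once event size or frequency grows.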

 

 

 

After we create the Event Hub namespace, we need to create and configure the Event Hub itself.  The Event Hub is the resource that receives, processes, and stores data messages.

 

 

 

Several settings chosen when creating an event hub can be important, depending on the amount and frequency of data being pushed to it.  The partition count determines how messages are sequenced and read: each partition holds an ordered subset of the messages, and newer data is always added to the end of a partition.  Because partitions can be read in parallel, the partition count sets the level of read concurrency, which has a large impact on performance.  Since the partition count cannot be changed after the fact, set it according to the expected workload so the hub can scale.  Microsoft recommends that the number of partitions be equal to or greater than the number of throughput units for best performance.  Message retention determines how long messages are stored in the event hub.
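
The partition behavior described above can be illustrated with a small in-memory model.  This is only a sketch of the idea: the class and function names are hypothetical, and the SHA-256 hash is a stand-in, since the hashing Event Hubs uses internally is not described in this post.

```python
import hashlib


def pick_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition index with a stable hash (illustrative stand-in)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


class PartitionedLog:
    """Toy model of an event hub: a fixed number of append-only partitions."""

    def __init__(self, partition_count: int):
        if partition_count < 1:
            raise ValueError("partition_count must be at least 1")
        # The count is fixed at creation time, mirroring the setting above.
        self.partitions = [[] for _ in range(partition_count)]

    def append(self, partition_key: str, body: str) -> int:
        """Add an event to the end of the partition chosen by its key."""
        index = pick_partition(partition_key, len(self.partitions))
        self.partitions[index].append(body)
        return index


log = PartitionedLog(partition_count=4)
first = log.append("device-1", "reading 1")
second = log.append("device-1", "reading 2")
# Events sharing a key land in the same partition, in the order they were sent.
print(first == second, log.partitions[first])
```

Each partition can be read independently of the others, which is why a higher partition count allows more concurrent readers.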

 

 

 

Now that we have the Event Hub instance created, we need to get the shared access key to define how to connect to the Event Hub from an application.  From the namespace window, select Shared access policies.

 

 

 

 

Once in the shared access policies, select RootManageSharedAccessKey.

 

 

 

You want to use the primary connection string of RootManageSharedAccessKey.  Click the copy button to place it in your clipboard.
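
The copied connection string is a semicolon-separated list of key=value pairs (Endpoint, SharedAccessKeyName, SharedAccessKey).  The sketch below shows one way to pull out the pieces that the script later in this post needs.  The helper names and the sample string are hypothetical placeholders, not real credentials.

```python
from urllib.parse import urlparse


def parse_connection_string(conn_str: str) -> dict:
    """Split an Event Hubs connection string into its key=value parts."""
    parts = {}
    for segment in conn_str.strip().split(";"):
        if not segment:
            continue
        # Split on the first "=" only: base64 keys usually end in "=" padding.
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts


def namespace_from_endpoint(endpoint: str) -> str:
    """Return the namespace name from an sb://<namespace>.servicebus.windows.net/ endpoint."""
    return urlparse(endpoint).hostname.split(".")[0]


# Placeholder value for illustration only.
sample = (
    "Endpoint=sb://mynamespace.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=not-a-real-key="
)
parts = parse_connection_string(sample)
print(namespace_from_endpoint(parts["Endpoint"]), parts["SharedAccessKeyName"])
```

In practice it is safer to read the connection string from an environment variable or a secret store than to paste it into a script.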

 

 

 

For the sake of this demo, I created a Python script that generates record sets and sends them as messages to the event hub.  Below you will see the test code and the parameters configured.  The script depends on the json module and the Service Bus library for Python.  There is also an Event Hub library, so feel free to use whichever library you are most comfortable with.  We will use the shared access key from above to make the connection to our Event Hub.  The remainder of the script simply generates random weather data and sends it to the event hub as events in JSON form.  Replace the following items with your own configuration values: ADDRESS, USER, KEY, service_namespace, and the event hub name passed to send_event.

 

import uuid
import datetime
import random
import json
# ServiceBusService ships with the legacy azure-servicebus package (0.50.x releases).
from azure.servicebus.control_client import ServiceBusService

# Address can be in either of these formats:
# "amqps://<URL-encoded-SAS-policy>:<URL-encoded-SAS-key>@<mynamespace>.servicebus.windows.net/myeventhub"
# "amqps://<mynamespace>.servicebus.windows.net/myeventhub"
# For example:
# (ADDRESS is shown for reference; ServiceBusService below connects by namespace name.)
ADDRESS = "amqps://StreamWeatherData.servicebus.windows.net/weatherdatamessagestream"

# SAS policy and key are not required if they are encoded in the URL
USER = "RootManageSharedAccessKey"
KEY = "fh6m8IpVSACft9UQyConSxWIbs/TBtqUHfAEDJesWiw="

# Connect to the namespace with the shared access policy name and key.
sbs = ServiceBusService(service_namespace='StreamWeatherData', shared_access_key_name=USER, shared_access_key_value=KEY)

# Simulate 10 devices, each identified by a random UUID.
devices = [str(uuid.uuid4()) for _ in range(10)]

# Send 10 rounds of readings, one event per device per round (100 events in total).
for y in range(0, 10):
    for dev in devices:
        # The device ID is not part of the payload; each event carries only the reading.
        reading = {"WeatherTimeStamp": str(datetime.datetime.utcnow()), "Temperature": random.randint(20, 100), "Visibility": random.randint(0, 10), "WindSpeed": random.randint(0, 60)}
        s = json.dumps(reading)
        sbs.send_event('weatherdatamessagestream', s)
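
As mentioned above, there is also an Event Hub library for Python.  The following is a minimal sketch of the same idea using the azure-eventhub package (version 5), assuming it is installed with pip install azure-eventhub.  The function names are hypothetical, the payload mirrors the fields used in the script above, and this is one possible approach rather than the script used for this demo.

```python
import datetime
import json
import random


def build_reading() -> dict:
    """Generate one random weather reading with the same fields as the script above."""
    return {
        "WeatherTimeStamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "Temperature": random.randint(20, 100),
        "Visibility": random.randint(0, 10),
        "WindSpeed": random.randint(0, 60),
    }


def send_readings(connection_string: str, eventhub_name: str, readings: list) -> None:
    """Send the readings to an event hub as a single batch of JSON events."""
    # Imported here so build_reading() can be used without the SDK installed.
    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        connection_string, eventhub_name=eventhub_name
    )
    with producer:
        batch = producer.create_batch()
        for reading in readings:
            # add() raises ValueError once the batch is full; a handful of
            # small readings like these stays well under the size limit.
            batch.add(EventData(json.dumps(reading)))
        producer.send_batch(batch)
```

To use it, call send_readings with the primary connection string copied earlier, your event hub name, and a list such as [build_reading() for _ in range(10)].  Sending the events as one batch makes a single call to the service instead of one call per event.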

 

 

Once you have modified the above Python code for your environment and run it in your IDE of choice, you should be able to refresh the metrics section of the Event Hub.  Below you will see the events generated by the Python script.  As written, the script sends 10 readings for each of 10 simulated devices, or 100 events in total.

 

 

 

Now that we have data in our event hub, we have a couple of options for integrating and using it.  We can set up a streaming job, or we can react to incoming events with some sort of trigger.  I will cover the streaming job option in another blog shortly, which will be the 3rd and final part of this series.

Stay tuned…

 
