Serverless Big Data

It’s been a long time since I’ve blogged here, but I figured it’s overdue. I’ve been busy in the Azure world and am excited to see how messaging has really started to shape the modern cloud landscape. Creating and shipping Azure Event Grid was perhaps the highlight of my career thus far, certainly at Microsoft. I’ll write a long-overdue blog about that shortly.

Seeing Azure Event Hubs grow to 2 trillion requests per day in the time I’ve been with it has also been a great experience. In this time not only has messaging become core to the cloud in general and Serverless in particular, but new patterns are starting to emerge that are really exciting. One of the most exciting things this month has been seeing the traction that our quietly released new Apache Kafka endpoint for Event Hubs is gaining and the new directions it is driving. The Kafka endpoint feature is available in some regions of Azure today and you can give it a try following this quick start. This allows you to use Kafka producers and consumers to read from and write to an Event Hub.
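To make this concrete, here is a minimal sketch of a Kafka producer pointed at an Event Hubs namespace, assuming the Confluent.Kafka .NET client; the namespace, hub name, and connection string below are placeholders. The key detail is that the namespace exposes the Kafka protocol on port 9093 and authenticates with SASL PLAIN using the connection string as the password.

```csharp
using System;
using System.Threading.Tasks;
using Confluent.Kafka;

class KafkaToEventHubs
{
    static async Task Main()
    {
        var config = new ProducerConfig
        {
            // The Event Hubs namespace exposes its Kafka endpoint on port 9093.
            BootstrapServers = "mynamespace.servicebus.windows.net:9093",
            SecurityProtocol = SecurityProtocol.SaslSsl,
            SaslMechanism = SaslMechanism.Plain,
            // Authenticate with the namespace connection string.
            SaslUsername = "$ConnectionString",
            SaslPassword = "<Event Hubs connection string>"
        };

        using var producer = new ProducerBuilder<Null, string>(config).Build();

        // The Kafka "topic" is simply the name of the event hub.
        var result = await producer.ProduceAsync(
            "myeventhub", new Message<Null, string> { Value = "hello from a Kafka client" });

        Console.WriteLine($"Delivered to {result.TopicPartitionOffset}");
    }
}
```

An unmodified Kafka consumer works the same way against the same endpoint, which is what makes the feature so interesting.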

By combining this with the Azure Functions Event Hubs binding you can create a Serverless Kafka processor in just a few minutes. The blog Processing 100,000 Events Per Second on Azure Functions shows how easily this Serverless processing can be pushed to very high scale. This means you can start to create truly Serverless Big Data solutions on Azure today. This is an exciting time and we will see more development in this space as users drive innovation on the platforms that we and other cloud providers are building.
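On the Functions side the processor is little more than a trigger binding. Here is a rough sketch using the Event Hubs trigger; the function name, hub name, and connection setting are illustrative, not taken from the posts above.

```csharp
using System.Text;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ServerlessKafkaProcessor
{
    // Receives batches of events from the hub the Kafka producers write to.
    // The Functions runtime fans out across partitions, so scale comes largely for free.
    [FunctionName("ServerlessKafkaProcessor")]
    public static void Run(
        [EventHubTrigger("myeventhub", Connection = "EventHubConnection")] EventData[] events,
        ILogger log)
    {
        foreach (var e in events)
        {
            var body = Encoding.UTF8.GetString(e.Body.Array, e.Body.Offset, e.Body.Count);
            log.LogInformation("Processed: {body}", body);
        }
    }
}
```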

Give this stuff a try and tell me what you think!

Welcome to the Messaging Party Google

This week Google announced the public release of Cloud Pub/Sub and we on the Azure Service Bus team would like to welcome them to the cloud messaging space. This is a big and growing market with a diverse set of competitors, technologies, and strategies. We feel that Google’s decision to enter this space validates our investments in Service Bus Messaging (Queues and Topics) and reflects the growing realization in the industry that messaging is a critical component of scalable applications and a vital part of any cloud architecture and of any cloud platform.

While it may at first appear that Google Cloud Pub/Sub and Service Bus Messaging are directly competing with each other, the services are quite different and each has its own strengths. More importantly, the real competition for both our services, and for the other players in the space, is not each other but direct application integration.

There has always been a tendency to wire applications directly to each other in a piecemeal, organic fashion that results in brittle, tightly coupled software. This tendency predates cloud computing and even network computing. Experienced architects know the problems that arise from this design – and know to avoid it. The cloud amplifies these problems. Hopefully Google will now impress on another group of engineers and architects the importance of the well-established architectural principles of loose coupling and separation of concerns – principles that have always been at the center of messaging architecture and guiding points for the Azure Service Bus.

Azure – The Operating System for the 21st Century

Now that I truly live Cloud computing every day as part of the Microsoft Azure product team I thought I’d share a few reflections about the evolution of Cloud computing over the past few years and how I think we’ve really crossed a threshold with the technology of cloud computing.

I recently commented that the cloud really is the operating system of the 21st century and I genuinely mean that.  Here’s why.  When you look at an Operating System (think back to your Concepts of Operating Systems class if you took one) what you’re talking about is a piece of software that manages the hardware of a machine.  Its job is to enable us to use the machine to do our bidding.  This ranges from basic features such as facilitating I/O, storage, and computational capabilities to more complex tasks such as networking, multitasking, and job scheduling.

Over time operating systems evolved to be very rich environments that we know today.  Looking through the current Azure feature set it quickly becomes apparent that Azure really has matured into a true Cloud OS – the Operating System of the 21st century.  Storage and compute are some of the oldest services and also mimic the evolution of operating systems – think way back – when the von Neumann architecture was a cutting edge concept.  Maybe even in the OS/360 timeframe.  Personal computers followed a similar path: from my Apple IIe which was really just storage, compute, and I/O to current operating systems that are truly rich experiences.  The cloud is on the same path – and Azure has progressed in a very short time from the cloud equivalent of DOS to a rich computing experience like nothing the world has ever known before.  This includes many concepts we would recall from Operating Systems: a job scheduler, compute, storage, I/O, and a powerful communications bus (yes, Service Bus).  The most striking part is that this really isn’t a Windows OS – it is an OS unto itself that is based very much on open protocols and can be leveraged by any client, or even server, OS.

It was a big risk for Microsoft to invest so heavily in the cloud – I appreciate that more being here and seeing how all-in the company is.  At first I wasn’t really sure about this bet, but viewed in the context of the cloud being an Operating System for the future – it makes perfect sense.

The Time Value of Data

I am doing more work than ever with the Internet of Things these days and I’ve wanted to write on this topic for some time. A larger article is in the works for publication, but I’ll give the high level here. Over the last few years my work with Smart Grid in particular and Big Data in general has made me acutely aware of a concept I have started calling the Time Value of Data. The idea was inspired by my interest in economics and draws on the Time Value of Money, a concept that dates back nearly 500 years to a city in Spain that I have always enjoyed visiting.

The theory behind the time value of money is quite straightforward: money today has a future value that is different from the current value. That is, capital has a value that changes over time: in a “normal” environment, some amount of money today is worth that amount plus some more in the future. This is actually a rather complex topic, but plenty has been written about it.
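For illustration only (my notation, not part of the forthcoming article), the familiar compounding relationship is:

```latex
FV = PV \times (1 + r)^{t}
```

so $100 today at a 5% annual rate is worth $105 a year from now – the same capital, a different value at a different point in time.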

What I want to focus on here is the value of data over time. Data generally has a unique value curve that is different from most other commodities – and yes, data is a commodity (or at least is becoming one). When we think about the Internet of Things in particular – devices, appliances, sensors, and telemetry – it becomes quite apparent that some of this data is going to have high immediate value. A fire alarm is a great example. Knowing about a fire is extremely valuable as it starts. This may allow for safe evacuation or even containment. As time passes the value of this information drops. Do I really care that my building had a fire several hours or days ago? Many of the sensors in use today are focused on this immediate value area.

There is also a secondary data story that is historical or collective data. This is where you save data in a raw form long enough to gain value from it. Good examples of this are climate data, defect rates, and energy usage. As more of this data is collected over longer periods of time its value increases dramatically: the individual data points may not be worth much on their own, but collectively the data set becomes ever more valuable. This is depicted in the chart below (I said this was a rough draft).

[Chart: The Time Value of Data]

As I mentioned, this is an idea I am still formalizing and will have an article about soon – so I invite any comments or contributions on this. Perhaps this is more of a U-shaped than a V-shaped curve, or maybe the right side doesn’t rise as high, but the concept is fairly robust when examining use cases.

More details on this and the implications will follow.

4 Reasons the “Smart” Grid is Dumb

Disclaimer – These are my opinions and mine alone. They do not represent the views of my employer or any organization I am a part of.

I work heavily with “Smart” technologies: in the energy and utilities sector, in the manufacturing sector, and in telemetry covering retail and a few other areas. Over the last few years my work in Smart Grid has been fairly extensive. If you don’t know already, “Smart Grid” basically means advanced telemetry built into every segment of the energy grid (strictly speaking, smart meter and smart grid are different things, though to most people they are the same). In my time implementing and consulting in this area I’ve come to see that there are a few really dumb things about the smart grid.

1) Lack of true standards at almost every level – Technology standards are what make the world interoperable. Ever send a text message or use the Internet or make a call from your mobile? That’s because standards allow devices and equipment from many vendors to work together – in two of those examples those standards are from the GSM Association. It is this interoperability that provides long-term viability for the overall market: for the vendors, for the providers, and for the users. There are very few standards in the Smart Grid arena and there is almost no equipment interoperability. The really bad part here is that Smart Meters aren’t that different from the rest of the Internet of Things (IoT) and should be sharing standards with other parts of the IoT ecosystem.

2) No cloud-first implementation strategy – None of the major vendors in the area are pursuing a cloud-first strategy. From a technology standpoint most of this twenty-first-century infrastructure is being solved with late-twentieth-century architecture. There is a lot of expensive on-premises technology that would feel right at home in the late 1990s. Cloud is important for valid reasons on both ends of the utility spectrum: small and large. Small utilities require a cost-effective solution to implement this technology and realize the benefits. They cannot afford expensive highly available platforms and their small load factors don’t require them, yet the industry at large only offers them expensive on-premises solutions that are overkill for most. Large utilities face another problem that a cloud-first strategy would solve: scale. A large utility is going to have millions of meters and they will be providing telemetry at timeframes as short as 15 minutes. This is going to create a lot of data. Let’s look at an example:
5 million meters x 96 readings per day (i.e. 4×24) = 480,000,000 readings

This is just meters! Telemetry on the distribution side could actually be even larger as the readings are likely to be more frequent. The result: Some seriously Big Data (another blog on that shortly). This load from the meters alone would break down to 5555 readings per second on average 24 hours a day, 7 days a week. Although that number is not that big, these events are likely to come in huge bursts. The software and platforms being selected to handle this load are not up to the task on either the messaging (delivery) or data (processing / storage) sides of this challenge. Many vendors and their relational / legacy data platforms think this will scale just fine – throw more hardware at it. It also allows them to sell more licenses and hardware. Unfortunately it just won’t work.

3) Lack of publish subscribe architectures – Building on issue 2 there is the very serious and technical aspect of architecture to be addressed. To be sure, we’re early in this Smart Grid game, but most of the solutions so far are trying to use web services at best and sometimes just batch processing to handle this data. This is a true travesty that I think may be the result of some insular group think. Even when web services are used they often don’t incorporate WS-* standards and almost always rely on polling, which also doesn’t scale. The environment that is ultimately developed ends up being an archipelago of services and data that does not build broad-scale extensibility into its design. Most of these architectures end up causing load and scale problems, so the vendors and users fall back on batch processing. This greatly diminishes the value of Smart data, as it arrives with such a delay that it cannot be used for real-time processing scenarios – which promise to provide the greatest innovation in the arena. Ultimately Smart systems need true publish subscribe capabilities built into their core to provide scale and extensibility. This is the only way to facilitate the development and addition of new components and capabilities without reengineering an expensive and possibly brittle implementation. But what sort of features and capabilities would require this architecture? Glad you asked! Perhaps things like real-time analytics to predict failures, demand shifts, and weather patterns. Like the Internet, it is not so much what we have thought of that will make Smart systems so successful, but what we will think of once a solid platform is in place. Publish subscribe is the key to extending these platforms to unlock their true value in the future – ideally with standardized protocols that create an open ecosystem.
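To make that concrete, here is a minimal publish subscribe sketch using Azure Service Bus topics and the Microsoft.ServiceBus.Messaging client; the topic and subscription names are made up for illustration. The publisher knows nothing about its consumers, so adding a new capability later is just adding a new subscription.

```csharp
using System;
using Microsoft.ServiceBus.Messaging;

class PubSubSketch
{
    static void Main()
    {
        var connectionString = "<Service Bus connection string>";

        // Publisher: the head-end system posts each meter reading to a topic.
        var topicClient = TopicClient.CreateFromConnectionString(connectionString, "meter-readings");
        topicClient.Send(new BrokeredMessage("{ \"meterId\": 42, \"kWh\": 1.3 }"));

        // Subscriber: each downstream capability (billing, outage detection,
        // real-time analytics) gets its own subscription and can be added later
        // without touching the publisher or re-engineering the platform.
        var analyticsSubscription = SubscriptionClient.CreateFromConnectionString(
            connectionString, "meter-readings", "analytics");
        analyticsSubscription.OnMessage(reading =>
            Console.WriteLine(reading.GetBody<string>()));

        Console.ReadLine();
    }
}
```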

4) Heavy vendor lock-in – This last point is really a culmination of all the others. Vendors produce their own parts of this Smart ecosystem with little thought for the larger environment and with a desire to protect revenue over a relatively short horizon. This manifests itself in single-vendor meter networks, closed platforms, and limited extensibility. I know we’re all in business to make money, but if the ecosystem isn’t healthy and providing choice and competition, then this money will be short-lived for “Smart”, as much of the value will be difficult to realize and innovation will be slowed. It is still early days for the technology, so I think this will change as the industry matures and vendors realize that they can all have slices of a bigger pie if they embrace interoperability.

The good news is that there is hope. We are very early in the creation of the Smart ecosystem and some participants are starting to take notice, much like mobile operators did in the past. Standards like AMQP are providing wire-level interoperability for a publish subscribe architecture that is vendor-agnostic and free to use. Some utilities are starting to demand support for robust open protocols. I have particularly seen this in European utilities, where I believe there may be more historic precedent for interoperability. Some members of this community are starting to look beyond the Utilities sector for inspiration and advice from other industries that have faced these exact challenges in the past, like telecommunications, financial services, and banking. All of these things bode well and, if embraced, will stop making the Smart Grid so dumb. It will be interesting to see.

Apache Storm on Windows

In a February release the Apache Storm community added Windows platform support with Storm 0.9.1.

I for one have been very excited to see this.  The Hortonworks distribution of Hadoop (HDP) is the only one that runs on both Windows and Linux and this gives a lot more choice to traditional enterprise clients.  I’ve been working with HDP for about a year and a half now and really like the experience – both on Linux and Windows. 

Storm is a very exciting development in real-time data processing using a Hadoop cluster.  This is useful for running models that you’ve created through more traditional batch processing and MapReduce within Hadoop.  Storm uses a simple spout and bolt topology for processing tuples of information at scale and in real time.  More information can be found at the Storm site: http://storm.incubator.apache.org/

I am now wondering if this technology, now running on Windows, will make it into the Windows Azure HDInsight service.  I certainly don’t have any inside information on this, but I’d be interested to see it. 

Wayfinding, Simplicity, and Design

Looking back on the last few years and the amount of travel I’ve done I’ve realized that the art and science of Wayfinding is an excellent tool for user experience testing and specifically for testing devices or apps.  According to Wikipedia: “Wayfinding encompasses all of the ways in which people and animals orient themselves in physical space and navigate from place to place”.

I’ve begun testing this theory out after long-haul flights.  I have found that this is a peculiar time in human consciousness when your normal abilities of reason and logic are deeply impaired.  When flying long haul everyone experiences a certain amount of discomfort even when travelling in style.  It could be the dry recycled air or the small and highly used lavatories, or the lack of space in the back of the plane, or even the abundance of libations in the front.  After an epic journey (especially transpacific) everyone is out of sorts.  Yet we all find our way through customs and to the train or taxi that we’re looking for.  I recently pulled a 28-hour, 11-time-zone journey that involved four airports, three flights, and two sets of immigration.  At the end I found my rental car shuttle (yes, I am an American, I rent cars), found my car, and then found my way to the hotel.  Believe me none of this is due to any special abilities I have in navigation or even common sense – it is completely due to the wayfinding design principles that have been used throughout the world to show us where to go.  This idea first came to me after reading one of Garr Reynolds’ books.  I thought his presentation of this was brilliant.  This is design that must work, for a large variety and number of people.

This is what has led me to testing my new apps and devices in this state of mind.  Case in point: I learned on this particular journey that my non-model-specific mobile phone windshield mount has a terrible design flaw with my Nokia Lumia 1020 – or for that matter any Windows Phone: the camera button is in the area where the side clamps hold the phone in place.  Result: I’m looking at a live (and small) image of the nighttime road ahead of me instead of my Nokia Drive app.  Fortunately getting back to an app on Windows Phone is easy – even after a 28-hour trip (there’s some good design).

Now whenever I build an app – or my team does – I always try to get that same level of detachment when I review it.  I’ve even begun to extend this to mock-ups, concepts, and presentations.  Sometimes I learn where a user flow is confusing or the next step is unclear.  Since I started writing this I have traversed the Atlantic – twice.  After the first flight I learned that my presentation on Real World Business Activity Monitoring for BizTalk Summit 2014 had a rather strange sequence in it that didn’t flow as well in this reduced-functionality state.  I rearranged some content and dropped some that didn’t fit as well, and then it seemed strong.  The crowd seems to have agreed, thankfully!

I suppose this last part of Wayfinding is sort of the key to it all: remove that which is not completely necessary to convey the message/information.  Anything else is waste or distraction.  Next time you travel anywhere check out the signage and notice how relatively easy it is to navigate.  This is a good inspiration.  When searching for simplicity, use that long day or that sleepless night to your advantage to review something you’ve been thinking about too much; it will give you a different perspective on the topic.

BizTalk Summit 2014 London

This week I had the pleasure of speaking at BizTalk Summit 2014 in London, which was sponsored by BizTalk 360.  I have to say it was the best BizTalk event I’ve ever been to.  My presentation is posted here but the slides aren’t much without the presenter… or the videos showing how I implement BAM on an order processing solution with zero code – fortunately there is a video to be posted by BizTalk 360 if you’re interested in seeing the message.  The sample solution I use, with all artifacts, is located at https://danrosanova.wordpress.com/rwb.

The talk was about Real World Business Activity Monitoring and the message was well received. In short, I drove home that we owe it to our customers (i.e. business people) to provide BAM so they can see what’s happening in a way they are comfortable with – which is normally Excel or Reporting Services. I took away two strong points from giving this talk.

First – more BizTalk shops use BAM than I had ever imagined.  Its use is sadly fairly limited in the US and in some other markets, but clearly many people use it.  About half the audience said they use it in production – I expected 1-2%.  This is encouraging as BAM is really worth doing.  It’s easy to implement and delivers high value at both technical and business levels.

Second, I learned that there is still a lot of interest in building these sorts of dashboards and that many people were eager to give BAM a try given how easy my presentation makes it look.  There is also a lot of interest in BizTalk – which I was really glad to see.  It is a great platform and, used correctly, has a strong place in many organizations.  I will be blogging a lot about BAM in the coming weeks and about Big Data, which brings me to my final takeaway.

Big Data is a big topic and it’s in the news a lot.  My next four presentations all focus on Big Data and Advanced Analytics and I’m going to dedicate some time to bringing these together with BizTalk and with BAM.  My next talk, at the INFORMS Conference on Business Analytics and Operations Research on April 1 in Boston, is focused on Situational Awareness with Big Data tools.  It is very much like BAM with Big Data, with even more dimensions.

Inspiration from Dyson CEO and West Monroe Partners

A little while back I read a great interview with Dyson Chief Executive Max Conze that really made me think about my career and the firm that I work for: West Monroe Partners.  In this article, which really was fascinating to me considering how innovative Dyson is and how high profile their founder is, Mr. Conze states: “The best you can do is hire a lot of smart young people, give them a lot of responsibility and they’re going to grow on it”. 

That is a truly profound statement – and the exact opposite of how most organizations work, but not all.  Nowhere in my fifteen year career have I seen this done more effectively than at West Monroe Partners, where I have been for nearly three years.  We hire bright young people and very quickly they have a lot of responsibility and a lot of freedom.  This helps them develop extremely fast and I find myself trusting people ten or more years my junior with tasks I hardly trust myself with. 

Dyson is obviously doing amazing and innovative work, and so is West Monroe.  I am really coming to realize it is because of the way we hire and the people we hire that we are able to be so agile and so innovative.  It can be very tempting not to delegate to junior staff, but it is important for them, for you, and for your company to do so.

Mr. Conze credits his military background with this philosophy and it reminds me of a quote by a military legend: “Never tell people how to do things. Tell them what to do and they will surprise you with their ingenuity” – General George S. Patton Jr.

I really am proud and fortunate to work at an organization that places such value on its people.  At the end of the day it is all a consulting firm has.  We do hire smart young people (and smart older people like me as well).  And we’re hiring now!


The biggest change in BizTalk 2013 and how to undo it

As I said earlier, I’ve been doing a lot of BizTalk lately and I’m definitely loving the BizTalk 2013 changes. The move to XslCompiledTransform is one of the things I’ve been really happy to see. I’m planning some real side-by-side benchmark numbers for a blog soon, but this is a feature that I think came out of the research Paolo Salvatori wrote about in How To Boost Message Transformations Using the XslCompiledTransform class. Paolo, you are my hero.
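If you haven’t used it outside of BizTalk, the appeal of XslCompiledTransform is that the stylesheet is compiled to IL once and then reused for every message. A rough sketch of the pattern outside of BizTalk (the file names are made up) looks like this:

```csharp
using System.Xml.Xsl;

class CompiledTransformSketch
{
    // Load (and compile) once; XslCompiledTransform is thread-safe for Transform calls.
    static readonly XslCompiledTransform Transform = LoadTransform();

    static XslCompiledTransform LoadTransform()
    {
        var xslt = new XslCompiledTransform();
        xslt.Load("OrderToInvoice.xslt"); // the expensive step: the XSLT is compiled to IL here
        return xslt;
    }

    static void Main()
    {
        // Every transform reuses the compiled stylesheet, which is where the big
        // performance gain over the old XslTransform comes from.
        Transform.Transform("Order.xml", "Invoice.xml");
    }
}
```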

I was recently working on a very large, very complex BizTalk implementation that is being upgraded and stumbled upon some very strange behaviors I could not fully understand. The first issue was that maps that I have good reason to believe are exactly what is running in production today started to fail during testing. Eventually I tracked down that the maps used inline C# in scripting functoids and did not mark the methods as public. At first I thought the code must have been changed in these maps, but it hadn’t (thanks ILSpy). So I tried the maps on an older BizTalk machine and sure enough, they worked. I know that XslCompiledTransform replaced the XslTransform class back when .NET 2.0 came out (and I was happy, as I did a lot of XSLT and XML in .NET before my BizTalk days). I also know BizTalk 2013 uses this much faster transform class. I decided to check out Migrating From the XslTransform Class on MSDN. Sure enough, there was my answer under Extension Objects and Script Functions:

XslCompiledTransform introduces two new restrictions on the use of script functions:

  • Only public methods may be called from XPath expressions.
  • Overloads are distinguishable from each other based on the number of arguments. If more than one overload has the same number of arguments, an exception will be raised.
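In other words, a hypothetical Scripting Functoid snippet like the one below now has to be declared public; without the modifier it compiles as private, which the old XslTransform tolerated and XslCompiledTransform does not:

```csharp
// Inline C# from a Scripting Functoid (hypothetical example).
// Omit "public" and the map works under XslTransform but fails under
// XslCompiledTransform, because only public methods are callable from XPath.
public string FormatName(string firstName, string lastName)
{
    return string.Format("{0}, {1}", lastName, firstName);
}
```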

So this was a bummer, but thanks to some handy RegEx skills I found all the places this was an issue in every map in the very large solution quite quickly (one of my computer science professors is smiling right now).

So life was good, or so I thought. Deeper into testing, some results were not as expected. I looked at these issues and again ended up looking at maps with inline C# in scripting functoids. Testing these maps on my workstation I could see they were not working as I had expected. It seemed like implicit Boolean conversion issues were happening. I changed a few maps and went on with my work, but eventually the scope of the issue dawned on me. This wasn’t just some maps; it was all the maps that used inline C# (which is something I don’t like anyway) with Boolean parameters. Now I had a real problem. I reached out to the super-secret group of BizTalk experts I’m a part of, but I must not have used the proper secret handshake as no one replied.

I got my testing to the point that I could walk through the maps, which was complicated by the fact that they also used external classes (thank you Maxime Labelle for Debugging XSLT Stylesheet with Custom Extension Objects from Within Visual Studio). I ended up dropping the method calls and extension objects and being able to reproduce this just inside of Visual Studio with XSLT debugging. The strange part was that the value in question would be a Boolean in the XSLT debugger. The BizTalk Mapper turns all such parameters into strings (for good reason) and even debugging the string($valx) in Visual Studio returned the correct value, but as soon as the .NET method was invoked the parameter would be passed in as true – no matter what, even if it was false.
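To show the shape of the problem, here is a hypothetical functoid method of the kind that was affected. Rewriting each one to take a string and parse it explicitly, as sketched below, would sidestep the coercion, but that means touching every affected map:

```csharp
// Hypothetical Scripting Functoid method. Declaring the parameter as bool is what
// triggered the issue: under XslCompiledTransform it arrived as true no matter
// what the source value was. Accepting the XSLT string and parsing it explicitly
// avoids the implicit conversion entirely.
public string MapStatus(string isActive)
{
    bool active;
    bool.TryParse(isActive, out active);
    return active ? "Active" : "Inactive";
}
```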

Eventually, exasperated, I turned to a Premier Field Engineer from Microsoft whom I had crossed paths with. I showed him what was happening and he confirmed I hadn’t completely lost my mind – or at least not on this issue. He soon came back to me with this: http://support.microsoft.com/kb/2887564/en-us, which despite my best Bing-ing I was unable to find myself (I even tried that other search engine out of desperation, but nothing). Here was the answer! As I read on I grew slightly concerned that this would be a “by design” sort of answer like other software companies give, but this is Microsoft! The company whose biggest weakness, IMO, is staunchly maintaining backwards compatibility – another blog perhaps. But there was hope: further in the article was my salvation:

It is also possible to configure the BizTalk 2013 Transform Engine to use the older XslTransform class. This approach is not recommended since the environment will lose the many performance and memory usage improvements provided by the XslCompiledTransform class. This change can be made by adding a DWORD UseXslTransform with value 1 at the following locations:

  • For 64 bit BizTalk host instances: HKLM\SOFTWARE\Microsoft\BizTalk Server\3.0\Configuration
  • For 32 bit BizTalk host instances and Visual Studio’s Test Map functionality: HKLM\SOFTWARE\Wow6432Node\Microsoft\BizTalk Server\3.0\Configuration
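If you would rather script the change than click through regedit, a rough sketch (run elevated, and repeat under the Wow6432Node path for the 32-bit case) would be:

```csharp
using Microsoft.Win32;

class EnableXslTransformFallback
{
    static void Main()
    {
        // Adds UseXslTransform = 1 for 64-bit BizTalk host instances.
        // 32-bit hosts and Visual Studio's Test Map use the Wow6432Node path instead.
        using (var key = Registry.LocalMachine.CreateSubKey(
            @"SOFTWARE\Microsoft\BizTalk Server\3.0\Configuration"))
        {
            key.SetValue("UseXslTransform", 1, RegistryValueKind.DWord);
        }
    }
}
```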

Not only was there a way around it, it was already built into the product! No patch, or hotfix, or anything!

I know I’m losing performance with this, but I’m also not changing dozens or hundreds of maps either. BizTalk ran fine before and will run fine still for these needs.

Thank you to the BizTalk development team for doing this right and to that PFE who saved my bacon.