CSV Connector for WSO2 BAM

I’ve worked on a small tool to publish your spreadsheets (after converting to CSVs of course!) to WSO2 BAM.

The cool thing is you can publish one (or thousands) of spreadsheets to WSO2 BAM, and use the HiveAnalytics UI to slice and dice them to produce neat results.

So, you need Maven to build and run this (and of course, WSO2 BAM up and running). Here are the steps:

1. Download and unzip the source from this link.

2. Run ‘mvn clean install’ at the unzipped location.

3. Now run the exec command in maven as per the following example: ‘mvn exec:java -Dexec.mainClass=org.wso2.carbon.bam.CSVAgent -DcsvFile=../ExportCustomerAccounts.csv -DstreamName=CustomerAccounts -DstreamVersion=1.2.0’

Here is what happens:

“CustomerAccounts” is the stream that will get created out of the CSV file, “ExportCustomerAccounts.csv”. All streams are versioned in BAM, so this stream will have the version “1.2.0”. Versioning means that when the CSV layout changes (columns deleted or added), you can publish it to the same stream under a different version.
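
To make the idea concrete, here is a minimal Python sketch of how a CSV header row could be turned into a named, versioned stream description. It is not the tool's actual code: the function name `describe_stream` and the dictionary layout are assumptions for illustration, and every column is treated as a plain string.

```python
import csv
import io


def describe_stream(csv_text, stream_name, stream_version):
    """Build a hypothetical stream description from a CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row holds the column names
    return {
        "name": stream_name,
        "version": stream_version,
        # Assumption: every column is published as a string attribute.
        "columns": [column.strip() for column in header],
    }


# Two versions of the same export: the second adds a "Region" column.
v1 = describe_stream("AccountId,Name\n1,Acme\n", "CustomerAccounts", "1.0.0")
v2 = describe_stream("AccountId,Name,Region\n1,Acme,EU\n", "CustomerAccounts", "1.2.0")
print(v1["columns"], v2["columns"])
```

The point of the sketch is only that the stream name stays the same while the version and the column list change together.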

The potential here is that you can publish any number of CSVs to BAM and use the SQL-like Hive query language to do joins and group bys, getting valuable information out of your spreadsheets.
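
As an illustration of the kind of question a join plus a group by answers, the Python sketch below does the same thing in memory for two tiny CSVs. The file contents and column names are made up, and in BAM the equivalent would be written as a Hive query over the published streams rather than in Python.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data standing in for two published CSVs.
ACCOUNTS_CSV = "AccountId,Region\n1,EU\n2,US\n3,EU\n"
ORDERS_CSV = "AccountId,Amount\n1,100\n2,250\n1,50\n3,75\n"


def total_by_region(accounts_text, orders_text):
    """Join orders to accounts on AccountId, then sum Amount per Region."""
    region_of = {
        row["AccountId"]: row["Region"]
        for row in csv.DictReader(io.StringIO(accounts_text))
    }
    totals = defaultdict(int)
    for order in csv.DictReader(io.StringIO(orders_text)):
        region = region_of.get(order["AccountId"])
        if region is not None:  # inner join: skip orders with no account
            totals[region] += int(order["Amount"])
    return dict(totals)


print(total_by_region(ACCOUNTS_CSV, ORDERS_CSV))
```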

Introductory webinar on the open source WSO2 Business Activity Monitor 2.0.0

Link to register : http://wso2.org/library/webinars/2012/09/introducing-all-new-wso2-bam2-all-your-business-monitoring-needs/

I will be doing an introductory webinar on the recently released WSO2 BAM 2.0.0 at the following times on Wednesday, 19th September, 2012:

  • 09:00 AM – 10:00 AM (PDT)
  • 10:00 AM – 11:00 AM (GMT)

In it, I will be doing two quick demos based on:

  • Defining custom KPIs and Analytics based on data from an iPhone App
  • Monitoring the WSO2 servers through the Service Stats toolbox

Here is the official content of the webinar:

In a webinar conducted earlier this year, we presented a preview of WSO2 BAM2, which is a complete re-write of the BAM 1.x versions. We explained how WSO2 BAM2 addresses the requirements of customization, scalability and performance based on NoSQL data storage, super fast data transfer rates, configuration based analytics, and WYSIWYG UI development tools.

Today WSO2 BAM2 is available with a whole new set of features and capabilities such as

  • Collecting and Storing any type of business data
  • High Performance Data Capture Framework with a REST API
  • Pre-Built Data Agents for all WSO2 Products
  • SQL-like analytics language
  • Scalable analytics based on Hadoop
  • Dashboard and Reporting capabilities

and more.

Join Tharindu Mathew on this webinar as he takes you through the enhanced features and capabilities of WSO2 BAM2 and demonstrates how they can be applied in common business scenarios such as,

  • SOA server monitoring
  • ESB monitoring
  • Custom KPI Definition and Monitoring

 

BAM, SOA & Big Data

Leveraging Big Data has become a commodity for most IT departments. It’s like the mobile phone. You can’t remember the times when you couldn’t just call someone from your mobile, no matter where you are in the world, can you? Similarly, IT folks can’t remember the days when files were too big to summarize, or grep, or even just store. Set up a Hadoop cluster and everything can be stored, analyzed and made sense of. But then I tried to ask the question: what if the data is not stored in a file? What if it is all flying around in my system?

Deployment

Shown above is a fairly common deployment of a production SOA setup. Let’s summarize briefly what each server does:

  • An ESB cluster fronts all the traffic and does some content based routing (CBR).
  • Internal and external app server clusters host apps that serve different audiences.
  • A Data Services Server cluster exposes Database operations as a service.
  • A BPS cluster coordinates a bunch of processes between the ESB, one App server cluster and the DSS cluster.

Hard to digest? Fear not. It’s a complicated system that would serve a lot of complex requirements while enhancing re-use, interoperability and all other good things SOA brings.

Now, in this kind of system, whether it’s SOA enabled or not, there lies a tremendous amount of data. And no, it is not stored in files. It is transferred between your servers and systems. Tons and tons of valuable data go through your system every day. What if you could excavate this treasure of data and make use of all the hidden gems to derive business intelligence?

The answer lies in Business Activity Monitoring (BAM-ing), which involves aggregating, analyzing and presenting data. SOA and BAM were always a love story. As system functions were exposed as services, monitoring these services meant you were able to monitor the whole system. Most of the time, if the system architects were smart, they used open standards, which made plugging in and monitoring systems even easier.

But even with BAM, it was impossible to capture every message and every request that passed through the server. The data growth alone would be tremendous for a fairly active deployment. So, here we have a Big Data problem, but not a typical one: a Big Data problem that concerns live data. To make full sense of the data that passes through modern systems and derive intelligence out of it, we need a Business Activity Monitor that is Big Data ready.

Now, a system architect has to worry about BAM, SOA and Big Data, as they are essentially intertwined. A solution that delivers anything less falls well short of visionary.

WSO2 BAM 2.0.0 released!

 

The screenshots above show the final result of a Service statistics monitoring use case. Data across many servers got published to BAM, had to be analyzed and then presented on the dashboard you see above. Nothing better than a cool dashboard to make sense of all that data 😉

It has been an enduring journey with an abundance of learning curves that allowed the BAM team to make some great technologies work together seamlessly. After spending almost a year on a complete re-write of the WSO2 Business Activity Monitor, we were able to put out the 2.0.0 release of this product. It has been a marathon effort for the last few months, and having a great team made all the work feel like a refreshing summer breeze.

The release note I concocted should say all you need to know about the product. A major thanks to everyone who helped inside and outside WSO2 to make the final release a reality.

 

WSO2 Business Activity Monitor 2.0.0 released!

The WSO2 Business Activity Monitor (WSO2 BAM) is an enterprise-ready, fully open source, complete solution for aggregating, analyzing and presenting information about business activities. The aggregation refers to collection of data, analysis refers to manipulation of data in order to extract information, and presentation refers to representing this data visually or in other ways such as alerts. The WSO2 BAM architecture reflects this natural flow in its design.

Since all WSO2 products are based on the component-based WSO2 Carbon platform, WSO2 BAM is lean, lightweight and consists of only the components required for efficient functioning. It does not contain unnecessary bulk, unlike many over-bloated, proprietary solutions. WSO2 BAM comprises only the modules required to give the best of performance, scalability and customizability, allowing businesses to achieve time-effective results for their solutions without sacrificing performance or the ability to scale.

The product is available for download at: http://wso2.com/products/business-activity-monitor

The documentation is available at: http://docs.wso2.org/wiki/display/BAM200/WSO2+Business+Activity+Monitor+Documentation

Key Features

  • Collect & Store any Type of Business Events

    • Events are named, versioned and typed by event source
    • Event structure consists of (name, value) tuples of business data, metadata and correlation data
  • High Performance Data Capture Framework

    • High performance, low latency API for receiving large volumes of business events over various transports including Apache Thrift, REST, HTTP and Web services
    • Scalable event storage into Apache Cassandra using column families per event type
    • Non-blocking, multi-threaded, low impact Java Agent SDK for publishing events from any Java based system
    • Use of Thrift, HTTP and Web services allows event publishing from any language or platform
    • Horizontally scalable with load balancing and highly available deployment
  • Pre-Built Data Agents for all WSO2 Products

  • Scalable Data Analysis Powered by Apache Hadoop

    • SQL-like flexibility for writing analysis algorithms via Apache Hive
    • Extensibility via analysis algorithms implemented in Java
    • Schedulable analysis tasks
    • Results from analysis can be stored flexibly, including in Apache Cassandra, a relational database or a file system
  • Powerful Dashboards and Reports

    • Tools for creating customized dashboards with zero code
    • Ability to write arbitrary dashboards powered by Google Gadgets and {JaggeryJS}
  • Installable Toolboxes

    • Installable artifacts to cover complete use cases
    • One click install to deploy all artifacts for a use case
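
The event model described under the first feature above (named, versioned streams carrying business data, metadata and correlation data) can be pictured with a small Python sketch. The class and field names here are assumptions for illustration only and are not the BAM Agent SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical event shape: a named, versioned stream plus three groups of (name, value) pairs."""
    stream_name: str
    stream_version: str
    payload: dict = field(default_factory=dict)      # business data
    meta: dict = field(default_factory=dict)         # metadata about the source
    correlation: dict = field(default_factory=dict)  # data used to link related events

    def stream_id(self):
        # Assumption: a stream is identified by its name and version together.
        return f"{self.stream_name}:{self.stream_version}"


event = Event(
    stream_name="CustomerAccounts",
    stream_version="1.2.0",
    payload={"AccountId": "1", "Name": "Acme"},
    meta={"host": "app-server-1"},
    correlation={"activity_id": "abc-123"},
)
print(event.stream_id())
```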

Issues Fixed in This Release

All fixed issues have been recorded at – http://bit.ly/Tzb1VP

Known Issues in This Release

All known issues have been recorded at – http://bit.ly/TzberZ

Engaging with Community

Mailing Lists

Join our mailing list and correspond with the developers directly.

Reporting Issues

WSO2 encourages you to report issues, enhancements and feature requests for WSO2 BAM. Use the issue tracker for reporting issues.

Discussion Forums

We encourage you to use stackoverflow (with the wso2 tag) to engage with developers as well as other users.

Training

WSO2 Inc. offers a variety of professional Training Programs, including training on general Web services as well as WSO2 Business Activity Monitor and a number of other products. For additional training information please refer to http://wso2.com/training/

Support

We are committed to ensuring that your enterprise middleware deployment is completely supported from evaluation to production. Our unique approach ensures that all support leverages our open development methodology and is provided by the very same engineers who build the technology.

For additional support information please refer to http://wso2.com/support/

For more information on WSO2 BAM, and other products from WSO2, visit the WSO2 website.


We welcome your feedback and would love to hear your thoughts on this release of WSO2 BAM.

The WSO2 BAM Development Team

 

WSO2 BAM 2.0.0-Alpha 2 released!

My team at WSO2 was able to release a 2nd alpha of our upcoming BAM 2.0. Do give it a spin.

The release note is below:

The WSO2 team is pleased to announce the release of version 2.0.0 – ALPHA 2 of WSO2 Business Activity Monitor.

WSO2 Business Activity Monitor (WSO2 BAM) is a comprehensive framework designed to solve problems in the wide area of business activity monitoring. WSO2 BAM comprises many modules to give the best of performance, scalability and customizability. These allow it to meet the requirements of business users, dev ops and CxOs without countless months spent customizing the solution, and without sacrificing performance or the ability to scale.

WSO2 BAM is powered by WSO2 Carbon, the SOA middleware component platform.

Downloads

The binary distribution can be downloaded at http://dist.wso2.org/products/bam/2.0.0-alpha2/wso2bam-2.0.0-ALPHA2.zip.

The documentation pack is available at http://dist.wso2.org/products/bam/2.0.0-alpha2/wso2bam-2.0.0-ALPHA2-docs.zip.

Samples
  1. Service Data Agent – Sample to install Service data agent, publish statistics and intercepted message activity from Service Hosting WSO2 Servers such as WSO2 AS, DSS, BPS, CEP, BRS and any other WSO2 Carbon server with the service hosting feature
  2. Mediation Data Agent – Sample to install Mediation data agent, publish mediation statistics and intercepted message activity using Message Activity Mediators from the WSO2 ESB
  3. Data center wide cluster monitoring – Sample to simulate two data centers each having two clusters sending statistics events, perform summarizations and visualize them in a dashboard
  4. End-to-End Message Tracing – Sample to simulate messages fired from a set of servers to WSO2 BAM and set up message tracing analytics and visualizations of respective messages
  5. KPI Definition – Sample to simulate receiving events from a server (e.g. WSO2 AS), perform summarizations and visualize product and consumer data in a retail store
  6. Fault Detection & Alerting – Sample to simulate receiving events from a server (e.g. WSO2 ESB), detect faults and fire email alerts

Features

  • Data Agents
    1. Pre built data agents – Service Data Agent for the WSO2 AS, DSS, BPS, CEP, BRS and any other WSO2 Carbon server with the service hosting feature and Mediation Data Agent for the WSO2 ESB
    2. A re-usable Agent API to publish events to the BAM server from any application (samples included)
    3. Apache Thrift based Agents to publish data at extremely high throughput rates
    4. Option to use Binary or HTTP protocols
  • Event Storage
    1. Apache Cassandra based scalable data architecture for high throughput of writes and reads
    2. Carbon based security mechanism on top of Cassandra
  • Analytics
    1. An Analyzer Framework with the capability of writing and plugging in any custom analysis tasks
    2. Built in Analyzers for common operations such as get, put, aggregate, alert, fault detection, etc.
    3. Scheduling capability of analysis tasks
  • Visualization
    1. Drag and drop gadget IDE to visualize analyzed data with zero code
    2. Capability to plug in additional UI elements and Data sources to Gadget IDE
    3. Google gadgets based dashboard

Reporting Issues

WSO2 encourages you to report issues, enhancements and feature requests for WSO2 BAM. Use the issue tracker for reporting any of these.

A revolution with Business Activity Monitor (BAM) 2.0

Producing middleware that is both lean and enterprise worthy is a difficult job. It’s either non-existent or requires innovative thinking (a lot of it) and a lot of going back and forth with your implementations. Very risky business, but if you get it right, it puts you far ahead of anyone else. It’s why we thought of re-writing WSO2 BAM from scratch and taking a leap rather than chugging away slowly with iterative fixes. If you prefer to hear me rather than read this, please catch a webinar on this at http://bit.ly/xKxm8R.

Diagram courtesy of http://softwarecreation.org/2008/ideas-in-software-development-revolution-vs-evolution-part-1/

When you try to monitor your business activities, you need to plug in to your servers and capture events. It sounds easy enough, so what’s the big deal, you may ask? Here are a few road blocks we hit with our initial BAM 1.x version:

  • Performance – We plugged in to our ESBs and App Servers and all the metrics were perfect. It nicely showed request counts, response times, etc. It was perfect as long as the load was low. If one server started sending 1000 events/sec, things started getting ugly. Even worse, if we plugged in to a few servers and started getting 1 billion events/day, well, that would have been a nightmare from the word go. We couldn’t even fathom what would happen at that type of scale.
  • Scalability – We need to store events and process them. Sadly, we discovered the hard way that this means we need to scale in many different ways.
    • Event load – We need to scale in terms of handling large amounts of events. We didn’t have a high performance server, but no matter how good our performance was, there would still be a breaking point. Beyond that, you need to scale.
    • Storage – Even if you store only 1000 events a day, your data will grow. And all of us hate deleting old email to get more inbox space. So naturally, everyone wants to keep their events.
    • Processing power – When you want to analyze the events that you collect, a single server can only give you so much processing power. You need to scale out your analytics. Another ‘oh, so obvious’ thing that we learnt eventually.
  • Customizability – We provided a lovely set of dashboards that showed all you wanted to know about your server and API metrics. But no one is ever satisfied with what they have. They want more. They want to monitor their own metrics, analyze their own data and put up their own graphs. And, of course, they want to do it now, not in 2 months.
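
To put the two load figures from the Performance point side by side, here is a quick back-of-the-envelope calculation in Python. Nothing in it is BAM-specific; it only converts between a per-second rate and a daily volume.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_day(events_per_second):
    """Daily volume for a source sending at a constant rate."""
    return events_per_second * SECONDS_PER_DAY


def events_per_second(events_per_day_total):
    """Average rate needed to reach a given daily volume."""
    return events_per_day_total / SECONDS_PER_DAY


# One server at 1000 events/sec produces 86.4 million events a day.
print(events_per_day(1000))
# 1 billion events a day averages out to roughly 11,574 events/sec.
print(round(events_per_second(1_000_000_000)))
```

So a billion events a day is roughly what a dozen servers at 1000 events/sec would produce, assuming a constant rate around the clock.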

 

In May 2011, we decided to start a whole new initiative to re-write WSO2 BAM from scratch. We analyzed the problem and made a few decisions. Here are a few of them.

  • Divide and conquer – We divided the problem. We have to aggregate, analyze and present data. So we built separate components for each, keeping in mind that we need to scale each individually. We mapped these into the event receiver, the analyzer framework and a presentation layer. Data agents are the link between anyone who wants to send events and the BAM server. The WSO2 Carbon platform allows us to easily uninstall a component from any server. This means we can take the BAM distro and uninstall the other components to make an Event Receiver BAM server, or an Analyzer BAM server. It’s just a click of a button.
The 3 main components of BAM 2.0
  • Scalable and fast storage – We chose to use Apache Cassandra as our storage solution. I do not want to argue that it’s the best data store ever. But it works well for us. It allows us to do fast writes to store a large amount of data, quickly. Also, it’s built to scale. Scaling up Cassandra takes minutes, not weeks. And scaling up doesn’t mean it’s going to cost you. Also, it’s written in Java, and being a Java house, it allows us to hack around the code.
  • Fast protocol – We chose to use Apache Thrift as our default protocol. There are many arguments against it, but it holds up well for us. It’s fast and it does its job. It allows us to maintain sessions and supports a bunch of languages. One key thing was that Cassandra uses it as well, allowing us to gain more performance in streaming data into Cassandra without deserializing.
  • Scalable analytics – We chose to write our own analytics language. But, if it doesn’t suit you, you can plug in your own Java code. Hadoop is unavoidable when it comes to scaling analytics. So, we decided to have a Hadoop mode for large amounts of data and a non-Hadoop mode, so that anyone can just use BAM without worrying about any Hadoop cluster.

  • Gadget based dashboards/reports – Drag and drop visualizations are very attractive when you don’t want to spend weeks writing code to visualize. We developed a gadget generator so you can quickly visualize your analyzed data easily.
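
The divide and conquer split above (aggregate, analyze, present) can be sketched as three small, independent pieces. This is only an illustration in Python under stated assumptions: the class names are made up, an in-memory list stands in for the event store, and none of it reflects the actual BAM components or their APIs.

```python
from collections import Counter


class EventReceiver:
    """Aggregation: collects incoming events (an in-memory list stands in for the event store)."""
    def __init__(self):
        self.store = []

    def receive(self, event):
        self.store.append(event)


class Analyzer:
    """Analysis: summarizes stored events, here by counting requests per service."""
    def summarize(self, events):
        return Counter(event["service"] for event in events)


class Presenter:
    """Presentation: renders the summary, here as plain text lines."""
    def render(self, summary):
        return [f"{service}: {count} requests" for service, count in sorted(summary.items())]


receiver = EventReceiver()
for service in ["orders", "billing", "orders"]:
    receiver.receive({"service": service})

summary = Analyzer().summarize(receiver.store)
lines = Presenter().render(summary)
print("\n".join(lines))
```

Because each stage only depends on the output of the previous one, each could in principle be run and scaled on its own, which is the property the design decision is after.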

After a couple of milestones, we were able to spin off an alpha. It’s available here: http://dist.wso2.org/products/bam/2.0.0-Alpha/wso2bam-2.0.0-ALPHA.zip. It is not a silver bullet and the documentation is still WIP. But, if we haven’t already reached our destination, it’s within our reach now.

 

WSO2 Business Activity Monitor 2.0.0 Alpha released!

After a lot of re-designing, re-architecting, re-writing and re-re-writing, we have come up with an alpha of the all new BAM 2. Although this is still an alpha, it will provide a good taste of things to come in the major BAM release.

Here’s the (not so) official release note:

WSO2 Business Activity Monitor (BAM) 2.0.0-Alpha is now available for download at [1].
The 2.0.0 alpha version is a complete re-write of BAM concentrating on scalability, performance and customizability.
Samples

This release contains samples that can be run without setting up another server to send events to the BAM server.
  1. KPI Definition – Sample to simulate receiving events from a server (e.g. WSO2 AS), perform summarizations and visualize product and consumer data in a retail store
  2. Fault Detection & Alerting – Sample to simulate receiving events from a server (e.g. WSO2 ESB), detect faults and fire email alerts
Features

Data Agents
  1. A re-usable Agent API to publish events to the BAM server from any application (samples included)
  2. Apache Thrift based Agents to publish data at extremely high throughput rates
  3. Option to use Binary or HTTP protocols
Event Storage
  1. Apache Cassandra based scalable data architecture for high throughput of writes and reads
  2. Carbon based security mechanism on top of Cassandra
Analytics
  1. An Analyzer Framework with the capability of writing and plugging in any custom analysis tasks
  2. Built in Analyzers for common operations such as get, put, aggregate, alert, fault detection, etc.
  3. Scheduling capability of analysis tasks
Visualization
  1. Drag and drop gadget IDE to visualize analyzed data with zero code
  2. Capability to plug in additional UI elements and Data sources to Gadget IDE
  3. Google gadgets based dashboard

We welcome you to use this and provide feedback ahead of the major release in Q1/Q2 2012.
Keys available at [2], [3].