CSV Connector for WSO2 BAM

I’ve worked on a small tool to publish your spreadsheets (after converting to CSVs of course!) to WSO2 BAM.

The cool thing is that you can publish one spreadsheet (or thousands) to WSO2 BAM, and use the HiveAnalytics UI to slice and dice them to produce neat results.

To build and run this you need Maven (and, of course, WSO2 BAM up and running). Here are the steps:

1. Download and unzip the source from this link.

2. Run 'mvn clean install' at the unzipped location.

3. Now run the exec command in Maven, as in the following example:

   mvn exec:java -Dexec.mainClass=org.wso2.carbon.bam.CSVAgent -DcsvFile=../ExportCustomerAccounts.csv -DstreamName=CustomerAccounts -DstreamVersion=1.2.0

Here is what happens:

“CustomerAccounts” is the stream that gets created from the CSV file “ExportCustomerAccounts.csv”. All streams are versioned in BAM, so this stream will have the version “1.2.0”. Versioning means that when the layout of the CSV changes (columns added or deleted), you can publish the new layout under a different stream version.
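To give a feel for how a CSV can map onto a named, versioned stream, here is a minimal sketch. It is not the actual CSVAgent source: the class CsvStreamSketch, the column names, the choice to treat every column as a STRING, and the JSON field names (name, version, payloadData) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a hypothetical helper, not the actual CSVAgent source. */
final class CsvStreamSketch {

    /** Splits a simple CSV header line into trimmed column names (no quoted-field support). */
    static List<String> parseHeader(String headerLine) {
        List<String> columns = new ArrayList<>();
        for (String raw : headerLine.split(",")) {
            String name = raw.trim();
            if (!name.isEmpty()) {
                columns.add(name);
            }
        }
        return columns;
    }

    /** Builds a JSON stream definition; every column becomes a STRING attribute (an assumption). */
    static String buildStreamDefinition(String streamName, String streamVersion, List<String> columns) {
        StringBuilder json = new StringBuilder();
        json.append("{\"name\":\"").append(streamName).append("\",");
        json.append("\"version\":\"").append(streamVersion).append("\",");
        json.append("\"payloadData\":[");
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append("{\"name\":\"").append(columns.get(i)).append("\",\"type\":\"STRING\"}");
        }
        json.append("]}");
        return json.toString();
    }

    public static void main(String[] args) {
        // Hypothetical header row of ExportCustomerAccounts.csv
        List<String> columns = parseHeader("AccountId, CustomerName, Balance");
        System.out.println(buildStreamDefinition("CustomerAccounts", "1.2.0", columns));
    }
}
```

If a later export adds or drops a column, the same call with a new version string (say “1.3.0”) would describe the new layout without touching the old one.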

The potential here is that you can publish any number of CSVs to BAM and use the SQL-like Hive query language to do joins and group-bys, getting valuable information out of your spreadsheets.

Best Practices in the WSO2 Carbon Platform

 

This post discusses best practices when programming with the WSO2 Carbon platform, which is the base for all WSO2 products.

Here are the main points discussed in this post:

  1. Do not hardcode versions of compile-time and run-time dependencies; re-use the version properties in the root pom.
  2. Use OSGi Declarative Services
  3. Do not copy paste code, re-use code through OSGi, Util methods, etc.
  4. Understand Multi tenancy and design for Multi Tenancy
  5. Write tests for your code

 

These points are discussed in detail below, with reasons and a HOWTO for each. I hope you find the details useful, as this is a long (and probably boring) read.

  1. Do not hardcode versions of compile-time and run-time dependencies; re-use the version properties in the root pom.
Ex (what to avoid):
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
            <version>1.0.0.wso2v1</version>
        </dependency>
Why avoid this?
Hardcoded versions threaten the stability of the build. If two versions of the same jar come into a product, it can cause OSGi-related errors that take some time to identify and fix.
How to avoid this?
Make sure the root pom (components/pom.xml or platform/pom.xml) defines a version property for the dependency, and use that property.
Ex: in platform/pom.xml
<orbit.version.json>2.0.0.wso2v1</orbit.version.json>
The dependency then refers to it as <version>${orbit.version.json}</version>. If the property is not defined, please define it in the root pom and use it.
 
 
 
2. Use OSGi Declarative Services
Do not get service references in the activate method, like this:
ServiceReference serviceReference = componentContext.getBundleContext().getServiceReference(Foo.class.getName());
        if (serviceReference != null) {
            foo = (Foo) componentContext.getBundleContext().getService(serviceReference);
        }
Why avoid this?
(Quoting Pradeep here.) This can lead to erroneous situations: the service may not be registered yet when your component activates, so you would have to check for it in a while loop to make it work properly. It becomes even more complicated when there are two or more service references.
Why bother with all this when the DS framework handles it for you?
How to avoid this?
Use a DS reference in your service component.
Ex:
/**
 * @scr.reference name="foo.comp"
 * interface="org.wso2.carbon.component.Foo"
 * cardinality="1..1" policy="dynamic" bind="setFoo" unbind="unsetFoo"
 */
// (the comment block above goes on the service component class; the methods below go inside it)
protected void setFoo(Foo foo) {
        // do whatever is needed with the service, ex:
        manager.setFoo(foo);
}
protected void unsetFoo(Foo foo) {
        // make sure to release the reference
        manager.setFoo(null);
}
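For the manager side of this, here is a minimal sketch in plain Java with no OSGi dependency. The Foo interface, its greet method and the FooManager class are hypothetical stand-ins; the point is only that the bound service sits in one holder that bind sets and unbind clears, and that callers read it once before using it.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical service interface standing in for org.wso2.carbon.component.Foo. */
interface Foo {
    String greet(String name);
}

/** Hypothetical manager that holds the Foo service bound and unbound by the DS runtime. */
final class FooManager {
    private final AtomicReference<Foo> foo = new AtomicReference<>();

    /** Called with the service on bind, and with null on unbind. */
    void setFoo(Foo service) {
        foo.set(service);
    }

    /** Uses the service if it is currently bound, and falls back otherwise. */
    String greet(String name) {
        Foo current = foo.get(); // read once, so a concurrent unbind cannot cause a NullPointerException
        return current == null ? "Foo service unavailable" : current.greet(name);
    }

    public static void main(String[] args) {
        FooManager manager = new FooManager();
        System.out.println(manager.greet("BAM")); // nothing bound yet
        manager.setFoo(n -> "Hello, " + n);       // what setFoo(Foo) does on bind
        System.out.println(manager.greet("BAM"));
        manager.setFoo(null);                     // what unsetFoo(Foo) does on unbind
        System.out.println(manager.greet("BAM"));
    }
}
```

With a 1..1 cardinality the component is only active while the service is present, so the fallback mostly matters for calls already in flight while the reference is being swapped.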
3. Do not copy paste code, re-use code through OSGi, Util methods, etc.
If you need functionality provided by other components, don’t copy and paste the code. Find a way to re-use it. If you need some common thing done, most probably a util method or an OSGi service for it already exists. If not, add or create one.
Why do this?
It might take some effort and discipline, but later on you (or someone else) will have to write less code. And if the original code changes, every copy-pasted version has to be fixed as well (which is often missed).
How to do this?
i. Use util methods/constants
This is easily shown with an example. Ex: to split the domain name from a user name, use MultitenantUtils.getTenantDomain(username);
The same applies to constants. Ex: MultitenantConstants.SUPER_TENANT_DOMAIN
ii. Use and expose OSGi services
You can expose any class you want by registering it as an OSGi service,
ex: In the bundle activator,
context.getBundleContext().
                        registerService(Foo.class.getName(), new Foo(), null);
Get a reference to it and re-use it as described in point 2.
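To show why a shared helper beats a copy-pasted string split, here is a small stand-in for the tenant domain lookup from (i). It is a sketch only, not the MultitenantUtils source: the class TenantNameSketch is hypothetical, and both the “carbon.super” value and the rule of taking everything after the last “@” are my assumptions about the behaviour.

```java
/** Illustrative stand-in; in real code, call MultitenantUtils from the platform instead. */
final class TenantNameSketch {
    /** Assumed value of the super tenant domain constant. */
    static final String SUPER_TENANT_DOMAIN = "carbon.super";

    /** Returns the part after the last '@', or the super tenant domain if there is none. */
    static String tenantDomainOf(String username) {
        int at = username.lastIndexOf('@');
        return at < 0 ? SUPER_TENANT_DOMAIN : username.substring(at + 1);
    }

    public static void main(String[] args) {
        System.out.println(tenantDomainOf("alice@example.com"));
        System.out.println(tenantDomainOf("admin"));
    }
}
```

Every caller that re-implements this by hand has to get the edge cases right again (no “@” at all, or a user name that itself contains an “@”), which is exactly the kind of detail a copy tends to miss.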
4. Understand Multi tenancy and design for Multi Tenancy
I feel that some folks don’t understand what multi tenancy (MT) means. It is an important aspect of the platform, and it should not be an afterthought but a part of the design.
Why do this?
Making code work for multiple tenants needs careful design, and in some cases it is not straightforward. Thinking about it only after a release, or only when you want to make the code work for MT, may require heavy refactoring. Now that the products and services are merged, multi tenancy should not be treated separately at all.
How to do this?
This is an extensive topic, so I will not go into details.
AxisConfigurationContextObserver and tenant-aware registries are some of the easy ways provided by the platform. If you depend on a non-MT dependency, you will have to figure out how to make it work in the MT case. You can always get help from other folks who have done MT-aware work.
 
5. Write tests for your code
Make sure you write tests for your code and reach a good % of code coverage. Otherwise, folks will not know whether their changes break functionality until it is too late.
Why do this?
The reasons are obvious and have been stated by many. But to reiterate, this makes the code base extremely stable. Other folks can change your code to fix bugs or make enhancements without worrying about breaking functionality, and without actually breaking it.
How to do this?
I personally prefer unit tests, but we also have an integration test framework as well as a system test framework (Clarity). Make sure you have tests that cover most of the functionality, if not all of it. Features should not be considered complete without test coverage.

 

If you can think of improvements to the points discussed, please do leave a comment and I will incorporate them into the post.

Introductory webinar on the open source WSO2 Business Activity Monitor 2.0.0

Link to register : http://wso2.org/library/webinars/2012/09/introducing-all-new-wso2-bam2-all-your-business-monitoring-needs/

I will be doing an introductory webinar on the recently released WSO2 BAM 2.0.0 at the following times on Wednesday, 19th September, 2012:

  • 09:00 AM – 10:00 AM (PDT)
  • 10:00 AM – 11:00 AM (GMT)

In it, I will be doing two quick demos based on:

  • Defining custom KPIs and Analytics based on data from an iPhone App
  • Monitoring the WSO2 servers through the Service Stats toolbox

Here is the official content of the webinar:

In a webinar conducted earlier this year, we presented a preview of WSO2 BAM2, which is a complete re-write of the BAM 1.x versions. We explained how WSO2 BAM2 addresses the requirements of customization, scalability and performance based on NoSQL data storage, super fast data transfer rates, configuration based analytics, and WYSIWYG UI development tools.

Today WSO2 BAM2 is available with a whole new set of features and capabilities such as

  • Collecting and Storing any type of business data
  • High Performance Data Capture Framework with a REST API
  • Pre-Built Data Agents for all WSO2 Products
  • SQL-like analytics language
  • Scalable analytics based on Hadoop
  • Dashboard and Reporting capabilities

and more.

Join Tharindu Mathew on this webinar as he takes you through the enhanced features and capabilities of WSO2 BAM2 and demonstrates how they can be applied in common business scenarios such as,

  • SOA server monitoring
  • ESB monitoring
  • Custom KPI Definition and Monitoring

 

WSO2 BAM 2.0.0 released!

 

The screenshots above show the final result of a Service statistics monitoring use case. Data across many servers got published to BAM, had to be analyzed and then presented on the dashboard you see above. Nothing better than a cool dashboard to make sense of all that data 😉

It has been an enduring journey with an abundance of learning curves, one that allowed the BAM team to make some great technologies work together seamlessly. After spending almost a year on a complete re-write of the WSO2 Business Activity Monitor 1.x product, we were able to put out the 2.0.0 release. It has been a marathon effort for the last few months, and having a great team made all the work feel like a refreshing summer breeze.

The release note I concocted should say all you need to know about the product. A major thanks to everyone who helped inside and outside WSO2 to make the final release a reality.

 

WSO2 Business Activity Monitor 2.0.0 released!

The WSO2 Business Activity Monitor (WSO2 BAM) is an enterprise-ready, fully open source, complete solution for aggregating, analyzing and presenting information about business activities. Aggregation refers to the collection of data, analysis refers to the manipulation of data in order to extract information, and presentation refers to representing this data visually or in other ways such as alerts. The WSO2 BAM architecture reflects this natural flow in its design.

Since all WSO2 products are based on the component-based WSO2 Carbon platform, WSO2 BAM is lean, lightweight and consists of only the required components for efficient functioning. It does not contain unnecessary bulk, unlike many over-bloated, proprietary solutions. WSO2 BAM comprises only the required modules to give the best performance, scalability and customizability, allowing businesses to achieve time-effective results for their solutions without sacrificing performance or the ability to scale.

The product is available for download at: http://wso2.com/products/business-activity-monitor

The documentation is available at: http://docs.wso2.org/wiki/display/BAM200/WSO2+Business+Activity+Monitor+Documentation

Key Features

  • Collect & Store any Type of Business Events

    • Events are named, versioned and typed by event source
    • Event structure consists of (name, value) tuples of business data, metadata and correlation data
  • High Performance Data Capture Framework

    • High performance, low latency API for receiving large volumes of business events over various transports including Apache Thrift, REST, HTTP and Web services
    • Scalable event storage into Apache Cassandra using column families per event type
    • Non-blocking, multi-threaded, low impact Java Agent SDK for publishing events from any Java based system
    • Use of Thrift, HTTP and Web services allows event publishing from any language or platform
    • Horizontally scalable with load balancing and highly available deployment
  • Pre-Built Data Agents for all WSO2 Products

  • Scalable Data Analysis Powered by Apache Hadoop

    • SQL-like flexibility for writing analysis algorithms via Apache Hive
    • Extensibility via analysis algorithms implemented in Java
    • Schedulable analysis tasks
    • Results from analysis can be stored flexibly, including in Apache Cassandra, a relational database or a file system
  • Powerful Dashboards and Reports

    • Tools for creating customized dashboards with zero code
    • Ability to write arbitrary dashboards powered by Google Gadgets and {JaggeryJS}
  • Installable Toolboxes

    • Installable artifacts to cover complete use cases
    • One click install to deploy all artifacts for a use case

Issues Fixed in This Release

All fixed issues have been recorded at – http://bit.ly/Tzb1VP

Known Issues in This Release

All known issues have been recorded at – http://bit.ly/TzberZ

Engaging with Community

Mailing Lists

Join our mailing list and correspond with the developers directly.

Reporting Issues

WSO2 encourages you to report issues, enhancements and feature requests for WSO2 BAM. Use the issue tracker for reporting issues.

Discussion Forums

We encourage you to use Stack Overflow (with the wso2 tag) to engage with developers as well as other users.

Training

WSO2 Inc. offers a variety of professional Training Programs, including training on general Web services as well as WSO2 Business Activity Monitor and a number of other products. For additional information please refer to http://wso2.com/training/

Support

We are committed to ensuring that your enterprise middleware deployment is completely supported from evaluation to production. Our unique approach ensures that all support leverages our open development methodology and is provided by the very same engineers who build the technology.

For additional support information please refer to http://wso2.com/support/

For more information on WSO2 BAM, and other products from WSO2, visit the WSO2 website.


We welcome your feedback and would love to hear your thoughts on this release of WSO2 BAM.

The WSO2 BAM Development Team

 

WSO2 BAM 2.0.0-Alpha 2 released!

My team at WSO2 was able to release a 2nd alpha of our upcoming BAM 2.0. Do give it a spin.

The release note is below:

The WSO2 team is pleased to announce the release of version 2.0.0 – ALPHA 2 of WSO2 Business Activity Monitor.

WSO2 Business Activity Monitor (WSO2 BAM) is a comprehensive framework designed to solve problems in the wide area of business activity monitoring. WSO2 BAM comprises many modules to give the best performance, scalability and customizability. These allow it to meet the requirements of business users, dev ops and CxOs without countless months spent on customizing the solution, and without sacrificing performance or the ability to scale.

WSO2 BAM is powered by WSO2 Carbon, the SOA middleware component platform.

Downloads

The binary distribution can be downloaded at http://dist.wso2.org/products/bam/2.0.0-alpha2/wso2bam-2.0.0-ALPHA2.zip.

The documentation pack is available at http://dist.wso2.org/products/bam/2.0.0-alpha2/wso2bam-2.0.0-ALPHA2-docs.zip.

Samples
  1. Service Data Agent – Sample to install Service data agent, publish statistics and intercepted message activity from Service Hosting WSO2 Servers such as WSO2 AS, DSS, BPS, CEP, BRS and any other WSO2 Carbon server with the service hosting feature
  2. Mediation Data Agent – Sample to install Mediation data agent, publish mediation statistics and intercepted message activity using Message Activity Mediators from the WSO2 ESB
  3. Data center wide cluster monitoring – Sample to simulate two data centers each having two clusters sending statistics events, perform summarizations and visualize them in a dashboard
  4. End – End Message Tracing – Sample to simulate messages fired from a set of servers to WSO2 BAM and set up message tracing analytics and visualizations of respective messages
  5. KPI Definition – Sample to simulate receiving events from a server (ex: WSO2 AS), perform summarizations and visualize product and consumer data in a retail store
  6. Fault Detection & Alerting – Sample to simulate receiving events from a server (ex: WSO2 ESB), detect faults and fire email alerts

Features

  • Data Agents
    1. Pre built data agents – Service Data Agent for the WSO2 AS, DSS, BPS, CEP, BRS and any other WSO2 Carbon server with the service hosting feature and Mediation Data Agent for the WSO2 ESB
    2. A re-usable Agent API to publish events to the BAM server from any application (samples included)
    3. Apache Thrift based Agents to publish data at extremely high throughput rates
    4. Option to use Binary or HTTP protocols
  • Event Storage
    1. Apache Cassandra based scalable data architecture for high throughput of writes and reads
    2. Carbon based security mechanism on top of Cassandra
  • Analytics
    1. An Analyzer Framework with the capability of writing and plugging in any custom analysis tasks
    2. Built-in Analyzers for common operations such as get, put, aggregate, alert, fault detection, etc.
    3. Scheduling capability of analysis tasks
  • Visualization
    1. Drag and drop gadget IDE to visualize analyzed data with zero code
    2. Capability to plug in additional UI elements and Data sources to Gadget IDE
    3. Google gadgets based dashboard

Reporting Issues

WSO2 encourages you to report issues, enhancements and feature requests for WSO2 BAM. Use the issue tracker for reporting any of these.

A revolution with Business Activity Monitor (BAM) 2.0

Producing middleware that is both lean and enterprise worthy is a difficult job. It’s either non-existent or requires innovative thinking (a lot of it) and a lot of going back and forth with your implementations. Very risky business, but if you get it right, it puts you far ahead of anyone else. It’s why we thought of re-writing WSO2 BAM from scratch and taking a leap rather than chugging away slowly by iterative fixing. If you prefer to hear me rather than read this, please catch a webinar on this at http://bit.ly/xKxm8R.

Diagram courtesy of http://softwarecreation.org/2008/ideas-in-software-development-revolution-vs-evolution-part-1/

When you try to monitor your business activities, you need to plug in to your servers and capture events. It sounds easy enough, so what’s the big deal, you may ask. Here are a few roadblocks we hit with our initial BAM 1.x version:

  • Performance – We plugged in to our ESBs and App Servers and all metrics were perfect. It nicely showed request counts, response times, etc. It was perfect as long as the load was low. If one server started sending 1000 events/sec, things started getting ugly. Even worse, if we had plugged in to a few servers and started getting 1 billion events / day, well, that would have been a nightmare from the word go. We couldn’t even fathom what would happen at that type of scale.
  • Scalability – We need to store events and process them. Sadly, we discovered the hard way that this means we need to scale in many different ways.
    • Event load – We need to scale in terms of handling large amounts of events. We didn’t have a high performance server, and no matter how good the performance is, there is still a breaking point. Beyond that, you need to scale.
    • Storage – Even if you store only 1000 events a day, your data will keep growing. And all of us hate deleting old email to get more inbox space. So naturally, everyone wants to keep their events.
    • Processing power – When you want to analyze the events that you collect, a single server can only give you so much processing power. You need to scale out your analytics. Another ‘oh, so obvious’ thing that we learnt eventually.
  • Customizability – We provided a lovely set of dashboards that showed all you wanted to know about your server and API metrics. But no one is ever satisfied with what they have. They want more. They want to monitor their own metrics, analyze their data and put up their own graphs. And, of course, they want to do it now, not in 2 months.

 

In May 2011, we decided to start a whole new initiative to re-write WSO2 BAM from scratch. We analyzed the problem and made a few decisions. Here are a few of them.

  • Divide and conquer – We divided the problem. We have to aggregate, analyze and present data. So we built separate components for each, keeping in mind that we need to scale each individually. We mapped these into the event receiver, the analyzer framework and a presentation layer. Data agents are the link between anyone who wants to send events and the BAM server. The WSO2 Carbon platform allows us to easily uninstall a component from any server. This means we can take the BAM distro and uninstall the other components to make an Event Receiver BAM server, or an Analyzer BAM server. It’s just a click of a button.
The 3 main components of BAM 2.0
  • Scalable and fast storage – We chose to use Apache Cassandra as our storage solution. I do not want to argue that it’s the best data store ever. But it works well for us. It allows us to do fast writes, so we can store a large amount of data quickly. Also, it’s built to scale. Scaling up Cassandra takes minutes, not weeks. And scaling up doesn’t mean it’s going to cost you. Also, it’s written in Java, and being a Java house, that allows us to hack around the code.
  • Fast protocol – We chose to use Apache Thrift as our default protocol. There are many arguments against it, but it holds up well for us. It’s fast and it does its job. It allows us to maintain sessions and supports a bunch of languages. One key thing was that Cassandra uses it as well, allowing us to gain more performance by streaming data into Cassandra without deserializing.
  • Scalable analytics – We chose to write our own analytics language. But if it doesn’t suit you, you can plug in your own Java code. Hadoop is unavoidable when it comes to scaling analytics. So we decided to have a Hadoop mode for large amounts of data and a non-Hadoop mode, so that anyone can just use BAM without worrying about a Hadoop cluster.

  • Gadget based dashboards/reports – Drag and drop visualizations are very attractive when you don’t want to spend weeks writing code to visualize. We developed a gadget generator so you can quickly visualize your analyzed data.

After a couple of milestones, we were able to spin off an alpha. It’s available here: http://dist.wso2.org/products/bam/2.0.0-Alpha/wso2bam-2.0.0-ALPHA.zip. It is not the silver bullet and documentation is still WIP. But, if we haven’t already reached our destination, it’s within our reach now.