Analyzers

BAM lets you write your own analyzer sequence to process the data it collects. An analyzer sequence chains operations such as get, aggregate, order, fault detection and alert to extract the information you need from the collected data. A sequence can also be scheduled to run at custom times, for example once a day, once a minute, every Wednesday at 12.15, or at midnight on the 30th of every month. BAM provides the following types of analyzers:

  1. Get Analyzer
  2. Put Analyzer
  3. GroupBy Analyzer
  4. OrderBy Analyzer
  5. Extract Analyzer
  6. Drop Analyzer
  7. Fault Detection Analyzer
  8. Alert Trigger

Get Analyzer

This analyzer retrieves data from an index, so an index must be created before the get analyzer is used. More information about creating an index is available at analyzer-framework.html.

Syntax :

  <get name='' batchSize='integer'>
  [0..1]  <where index=''>
             <range column='' start='' end=''/> [1..*]
          </where>
  </get>

Syntax Explanation :

  <get name='' batchSize='integer'>                       :name = the column family to
                                                                retrieve data from
                                                          :batchSize = the number of
                                                                records to retrieve into
                                                                memory at one time
  [0..1]  <where index=''>                                :index = the name of the index
                                                                to retrieve from
             <range column='' start='' end=''/> [1..*]    :column = the column that was
                                                                indexed by the given index
                                                          :start = starting value of the
                                                                range
                                                          :end = ending value of the
                                                                range
                                                                NOTE: Both start and end
                                                                should be empty strings ("")
                                                                to retrieve all values for
                                                                the column family for the
                                                                index
          </where>
  </get>

Example :

  1. Example 1
    • Input : There is no input for the get analyzer.
    • Existing data in column family 'result':
    •     {ESB} : {{requestCount : 65}, {responseCount : 65}, {responseTime : 7.0} }
          {AS} : {{requestCount : 4}, {responseCount : 4}, {responseTime : 23.0} }
          {BAM} : {{requestCount : 56}, {responseCount : 56}, {responseTime : 35} }
          {BRS} : {{requestCount : 43}, {responseCount : 43}, {responseTime : 53.3} }
      
    • Get specification :
    •   <get name='result' batchSize='2'>
           <where index='resultIndex'>
              <range column='requestCount' start='' end=''/>
           </where>
        </get>
      
      
    • Output : Retrieved data from the column family 'result'
    •     {ESB} : {{requestCount : 65}, {responseCount : 65}, {responseTime : 7.0} }
          {AS} : {{requestCount : 4}, {responseCount : 4}, {responseTime : 23.0} }
      
      
      
  2. Example 2
    • Input : There is no input for the get analyzer.
    • Existing data in column family 'result':
    •     {ESB} : {{requestCount : 65}, {responseCount : 65}, {responseTime : 7.0} }
          {AS} : {{requestCount : 4}, {responseCount : 4}, {responseTime : 23.0} }
          {BAM} : {{requestCount : 56}, {responseCount : 56}, {responseTime : 35} }
          {BRS} : {{requestCount : 43}, {responseCount : 43}, {responseTime : 53.3} }
      
    • Get specification :
    •   <get name='result' batchSize='100'>
           <where index='resultIndex'>
              <range column='requestCount' start='AS' end='BRS'/>
           </where>
        </get>
      
      
    • Output : Retrieved data from the column family 'result'
    •     {AS} : {{requestCount : 4}, {responseCount : 4}, {responseTime : 23.0} }
          {BAM} : {{requestCount : 56}, {responseCount : 56}, {responseTime : 35} }
          {BRS} : {{requestCount : 43}, {responseCount : 43}, {responseTime : 53.3} }
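
The retrieval behaviour shown in the two examples above can be modelled with a short Python sketch. It is not BAM's implementation. The function name get_rows and the in-memory dictionary standing in for the column family are hypothetical, and the sketch assumes, based only on the example output, that the range is inclusive, that it is compared lexically against the row keys in stored order, and that Example 1 shows just the first batch of two rows.

```python
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, object]


def get_rows(column_family: Dict[str, Row], start: str = "", end: str = "",
             batch_size: int = 100) -> Iterator[List[Tuple[str, Row]]]:
    """Yield batches of (key, row) pairs whose key lies in [start, end].

    Assumption: an empty start or end means "no bound on that side",
    following the NOTE in the syntax explanation.
    """
    selected = [
        (key, row)
        for key, row in column_family.items()      # stored order (assumed)
        if (start == "" or key >= start) and (end == "" or key <= end)
    ]
    for i in range(0, len(selected), batch_size):
        yield selected[i:i + batch_size]           # one batch in memory at a time


result = {
    "ESB": {"requestCount": 65, "responseCount": 65, "responseTime": 7.0},
    "AS": {"requestCount": 4, "responseCount": 4, "responseTime": 23.0},
    "BAM": {"requestCount": 56, "responseCount": 56, "responseTime": 35},
    "BRS": {"requestCount": 43, "responseCount": 43, "responseTime": 53.3},
}

# Mirrors Example 1: no bounds, batchSize='2' (first batch only).
first_batch_keys = [key for key, _ in next(get_rows(result, batch_size=2))]

# Mirrors Example 2: start='AS', end='BRS', batchSize='100'.
range_keys = [key for key, _ in next(get_rows(result, "AS", "BRS", 100))]
print(first_batch_keys, range_keys)
```

Using a generator keeps at most batch_size rows in memory per step, which is the purpose the syntax explanation gives for batchSize.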
      
      
      

Put Analyzer

This analyzer stores data in a given column family. Storing grouped rows is currently not supported; only a list of rows can be stored. You can also define what happens when a row with the given key already exists: either the entire row is replaced, or specified fields of the existing row are aggregated with the new row being stored.

Syntax :

  <put name='' dataSource=''>
     (0..1)<onExist>
              <replace/>
              <aggregate>
                 +<measure name='' aggregationType=''/>
              </aggregate>
           </onExist>
  </put>

Syntax Explanation :

 <put name='' dataSource=''>                              : name = the column family to store data in
                                                            dataSource = the data source this
                                                            table belongs to.
    (0..1)<onExist>                                       : specifies the behaviour if a row with the
                                                            given key already exists. If not present,
                                                            the default behaviour is to replace the
                                                            row.
             <replace/>                                   : replace the row. This is the default behaviour.
             <aggregate>                                  : aggregate the given fields of the existing and new rows and store the result.
                +<measure name='' aggregationType=''/>    : the field to be aggregated along with the type of
                                                            aggregation (SUM, MIN, MAX, AVG, CUM). The other
                                                            fields are replaced if they already exist, or newly
                                                            added to the stored row otherwise.
             </aggregate>
          </onExist>
 </put>

Example :
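
The replace and aggregate behaviours described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the put analyzer's code: the function put, the dictionary used as the store, and the sample values are hypothetical. Only SUM, MIN and MAX are modelled, because this page does not define how AVG and CUM combine an existing value with a new one.

```python
from typing import Callable, Dict, Optional

Row = Dict[str, float]

# Only aggregation types with an unambiguous pairwise meaning are modelled.
AGGREGATORS: Dict[str, Callable[[float, float], float]] = {
    "SUM": lambda old, new: old + new,
    "MIN": min,
    "MAX": max,
}


def put(store: Dict[str, Row], key: str, new_row: Row,
        measures: Optional[Dict[str, str]] = None) -> None:
    """Store new_row under key.

    measures maps a field name to an aggregation type. None models
    <replace/>, which is also the default when <onExist> is absent.
    """
    existing = store.get(key)
    if existing is None or measures is None:
        store[key] = dict(new_row)            # replace the entire row
        return
    merged = dict(existing)
    merged.update(new_row)                    # other fields: replaced or newly added
    for field, agg_type in measures.items():
        if field in existing and field in new_row:
            merged[field] = AGGREGATORS[agg_type](existing[field], new_row[field])
    store[key] = merged


store: Dict[str, Row] = {"ESB": {"requestCount": 65, "responseTime": 7.0}}

# Models <aggregate><measure name='requestCount' aggregationType='SUM'/></aggregate>
put(store, "ESB", {"requestCount": 5, "responseTime": 9.0},
    measures={"requestCount": "SUM"})
aggregated = dict(store["ESB"])

# Models <replace/>: the stored row becomes exactly the new row.
put(store, "ESB", {"requestCount": 1})
replaced = dict(store["ESB"])
print(aggregated, replaced)
```

In the aggregate case requestCount is summed while responseTime, which is not listed as a measure, is simply overwritten by the new value.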

GroupBy Analyzer

This analyzer groups rows according to the values of the columns specified by 'field' or 'time'. The result is a map: each key is a unique concatenation of the 'field' column values, and each value is the list of rows that have those values in the respective columns. 'time' is a special field containing a date-time whose value can be rounded down to a given granularity (e.g. hour) during grouping.

Syntax :

  <groupBy>
     +<field name=''/>
     (0..1)<time name='' granularity='minute,hour,day,month,year'/>
  </groupBy>

Example :

  1. Example 1:
    • Input : List of rows as given below
    •    {employee1} : {{name : ben}, {age : 42}, {post : eng}, {dept : E1}, {joinDate : 2011-01-21} }
         {employee2} : {{name : alex}, {age : 35}, {post : admin}, {dept : A1}, {joinDate : 2011-03-24} }
         {employee3} : {{name : bob}, {age : 48}, {post : eng}, {dept : E1}, {joinDate : 2011-01-22} }
         {employee4} : {{name : sarah}, {age : 26}, {post : admin}, {dept : A2 }, {joinDate : 2008-02-28} }
         {employee5} : {{name : ben}, {age : 42}, {post : eng} , {dept : E2}, {joinDate : 2010-07-23} }
      
      
    • GroupBy Specification :
    •          <groupBy>
                  <field name='dept'/>
                  <field name='post'/>
                  <time name='joinDate' granularity='month'/>
               </groupBy>
      
      
    • Output : Rows grouped by each unique concatenated value of columns 'dept', 'post' and 'joinDate'. Values of the 'joinDate' column have been rounded to months as defined by 'granularity'.
    •    [E1---eng---2011-01]   : {employee1} : {{name : ben}, {age : 42}, {post : eng}, {dept : E1}, {joinDate : 2011-01-21} }
                                  {employee3} : {{name : bob}, {age : 48}, {post : eng}, {dept : E1}, {joinDate : 2011-01-22} }
         [E2---eng---2010-07]   : {employee5} : {{name : ben}, {age : 42}, {post : eng} , {dept : E2}, {joinDate : 2010-07-23} }
         [A1---admin---2011-03] : {employee2} : {{name : alex}, {age : 35}, {post : admin}, {dept : A1}, {joinDate : 2011-03-24} }
         [A2---admin---2008-02] : {employee4} : {{name : sarah}, {age : 26}, {post : admin}, {dept : A2 }, {joinDate : 2008-02-28} }
      
      
  2. Example 2: When the input itself consists of grouped rows, new groupings are formed using each unique value of the 'field' columns in the rows found in the groups.
    • Input : List of grouped rows as given below
    •    [30-plus]      : {employee1} : {{name : ben}, {age : 42}, {post : eng}, {dept : E1}, {joinDate : 2011-01-21} }
                          {employee2} : {{name : alex}, {age : 35}, {post : admin}, {dept : A1}, {joinDate : 2011-03-24} }
                          {employee3} : {{name : bob}, {age : 48}, {post : eng}, {dept : E1}, {joinDate : 2011-01-22} }
                          {employee5} : {{name : ben}, {age : 42}, {post : eng} , {dept : E2}, {joinDate : 2010-07-23} }
         [less-than-30] : {employee4} : {{name : sarah}, {age : 26}, {post : admin}, {dept : A2 }, {joinDate : 2008-02-28} }
      
      
    • GroupBy Specification :
    •          <groupBy>
                  <field name='dept'/>
                  <field name='post'/>
                  <time name='joinDate' granularity='month'/>
               </groupBy>
      
      
    • Output : Rows grouped by each unique concatenated value of columns 'dept', 'post' and 'joinDate'. Values of the 'joinDate' column have been rounded to months as defined by 'granularity'. The old grouping is not present.
    •     [E1---eng---2011-01]   : {employee1} : {{name : ben}, {age : 42}, {post : eng}, {dept : E1}, {joinDate : 2011-01-21} }
                                   {employee3} : {{name : bob}, {age : 48}, {post : eng}, {dept : E1}, {joinDate : 2011-01-22} }
          [E2---eng---2010-07]   : {employee5} : {{name : ben}, {age : 42}, {post : eng} , {dept : E2}, {joinDate : 2010-07-23} }
          [A1---admin---2011-03] : {employee2} : {{name : alex}, {age : 35}, {post : admin}, {dept : A1}, {joinDate : 2011-03-24} }
          [A2---admin---2008-02] : {employee4} : {{name : sarah}, {age : 26}, {post : admin}, {dept : A2 }, {joinDate : 2008-02-28} }
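
The grouping in both examples can be reproduced with a few lines of Python. This is an illustrative sketch rather than the analyzer's source: the names group_by and flatten are hypothetical, the '---' separator is copied from the example output, and dates are assumed to be ISO 'YYYY-MM-DD' strings so that truncating the string yields the year, month or day (hour and minute are left out because the sample data has no time part).

```python
from typing import Dict, List, Optional

Row = Dict[str, str]
Rows = Dict[str, Row]

# Number of leading characters of a 'YYYY-MM-DD' string kept per granularity.
PREFIX_LENGTH = {"year": 4, "month": 7, "day": 10}


def group_by(rows: Rows, fields: List[str], time_field: Optional[str] = None,
             granularity: str = "day") -> Dict[str, Rows]:
    """Group rows by the concatenated values of fields plus the truncated time."""
    groups: Dict[str, Rows] = {}
    for row_key, row in rows.items():
        parts = [row[name] for name in fields]
        if time_field is not None:
            parts.append(row[time_field][:PREFIX_LENGTH[granularity]])
        groups.setdefault("---".join(parts), {})[row_key] = row
    return groups


def flatten(groups: Dict[str, Rows]) -> Rows:
    """Discard an existing grouping so the rows can be grouped again."""
    rows: Rows = {}
    for members in groups.values():
        rows.update(members)
    return rows


employees: Rows = {
    "employee1": {"name": "ben", "age": "42", "post": "eng", "dept": "E1", "joinDate": "2011-01-21"},
    "employee2": {"name": "alex", "age": "35", "post": "admin", "dept": "A1", "joinDate": "2011-03-24"},
    "employee3": {"name": "bob", "age": "48", "post": "eng", "dept": "E1", "joinDate": "2011-01-22"},
    "employee4": {"name": "sarah", "age": "26", "post": "admin", "dept": "A2", "joinDate": "2008-02-28"},
    "employee5": {"name": "ben", "age": "42", "post": "eng", "dept": "E2", "joinDate": "2010-07-23"},
}

# Example 1: plain list of rows.
groups = group_by(employees, ["dept", "post"], "joinDate", "month")

# Example 2: already grouped input; the old grouping is dropped first.
by_age = {
    "30-plus": {k: employees[k] for k in ("employee1", "employee2", "employee3", "employee5")},
    "less-than-30": {"employee4": employees["employee4"]},
}
regrouped = group_by(flatten(by_age), ["dept", "post"], "joinDate", "month")
print(sorted(groups))
```

Both calls produce the same four groups, which matches the note in Example 2 that the old grouping is not present in the output.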
      
      

OrderBy Analyzer

This analyzer orders rows according to the values of the column specified by 'field'. Ordering is lexical.

Syntax :

  <orderBy field=''/>

Example :

  1. Example 1:
    • Input : List of rows as given below.
    •      {employee1} : {{name : ben}, {age : 42}, {post : eng} }
           {employee2} : {{name : alex}, {age : 35}, {post : admin} }
           {employee3} : {{name : bob}, {age : 48}, {post : eng} }
           {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
    • OrderBy Specification :
    •       <orderBy field='name'/>
      
    • Output : Rows ordered by the value of 'name'.
    •     {employee2} : {{name : alex}, {age : 35}, {post : admin} }
          {employee1} : {{name : ben}, {age : 42}, {post : eng} }
          {employee3} : {{name : bob}, {age : 48}, {post : eng} }
          {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
  2. Example 2: When the input consists of grouped rows, ordering happens within each group.
    • Input : List of grouped rows as given below.
    •     [30-plus]      : {employee1} : {{name : ben}, {age : 42}, {post : eng} }
                           {employee2} : {{name : alex}, {age : 35}, {post : admin} }
                           {employee3} : {{name : bob}, {age : 48}, {post : eng} }
          [less-than-30] : {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
    • OrderBy Specification :
    •       <orderBy field='name'/>
      
    • Output : Rows ordered by name within each group.
    •     [30-plus]      : {employee2} : {{name : alex}, {age : 35}, {post : admin} }
                           {employee1} : {{name : ben}, {age : 42}, {post : eng} }
                           {employee3} : {{name : bob}, {age : 48}, {post : eng} }
          [less-than-30] : {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
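
A short Python sketch shows what lexical ordering implies in practice. The helper names order_by and order_by_grouped are hypothetical and the code only models the behaviour described on this page. The final lines illustrate a consequence of lexical comparison that is easy to overlook: numeric values compare as strings, so '100' sorts before '26' (the value '100' is made up for the illustration).

```python
from typing import Dict

Row = Dict[str, str]
Rows = Dict[str, Row]


def order_by(rows: Rows, field: str) -> Rows:
    """Return the rows sorted lexically by the string value of field."""
    return dict(sorted(rows.items(), key=lambda item: str(item[1][field])))


def order_by_grouped(groups: Dict[str, Rows], field: str) -> Dict[str, Rows]:
    """For grouped input, order the rows inside each group separately."""
    return {name: order_by(rows, field) for name, rows in groups.items()}


employees: Rows = {
    "employee1": {"name": "ben", "age": "42", "post": "eng"},
    "employee2": {"name": "alex", "age": "35", "post": "admin"},
    "employee3": {"name": "bob", "age": "48", "post": "eng"},
    "employee4": {"name": "sarah", "age": "26", "post": "admin"},
}

ordered_keys = list(order_by(employees, "name"))

grouped = {
    "30-plus": {k: employees[k] for k in ("employee1", "employee2", "employee3")},
    "less-than-30": {"employee4": employees["employee4"]},
}
ordered_groups = order_by_grouped(grouped, "name")

# Lexical, not numeric: the string '100' sorts before '26'.
lexical_ages = sorted(["26", "100", "42"])
print(ordered_keys, lexical_ages)
```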
      
      
      

Extract Analyzer

This analyzer creates new columns from values extracted from XML content present in existing columns.

Syntax :

  <extract>
     +<field from='' name='' xpath=''>
         +<namespace prefix='' uri=''/>
      </field>
  </extract>

Syntax Explanation :

  <extract>                            : Top-level element of the extract analyzer
     <field from='' name='' xpath=''>  : Specifies details of a new column to extract from XML content
                                         from = the column containing the XML content to extract from
                                         name = the name of the new column formed from the extracted value
                                         xpath = the XPath expression used to extract content from the XML
        <namespace prefix='' uri=''/>  : Specifies a namespace used in the XPath expression
     </field>
  </extract>

Example :
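
The idea behind the extract analyzer can be sketched in Python with the standard library's xml.etree.ElementTree, which supports a subset of XPath. This is only a model of the behaviour described above, not BAM's code: the function extract, the sample row and the new column name faultReason are chosen for illustration, and the XML is a shortened, well-formed SOAP fault.

```python
import xml.etree.ElementTree as ET
from typing import Dict

Row = Dict[str, str]


def extract(rows: Dict[str, Row], from_column: str, name: str, xpath: str,
            namespaces: Dict[str, str]) -> Dict[str, Row]:
    """Add a column called name holding the text found by xpath in from_column."""
    for row in rows.values():
        xml_text = row.get(from_column)
        if xml_text is None:
            continue                          # row has no XML column to read
        node = ET.fromstring(xml_text).find(xpath, namespaces)
        if node is not None:
            row[name] = node.text or ""
    return rows


SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
rows = {
    "event1": {
        "message_body": (
            f'<soapenv:Body xmlns:soapenv="{SOAP_NS}">'
            "<soapenv:Fault><soapenv:Reason>"
            "<soapenv:Text>Invalid value</soapenv:Text>"
            "</soapenv:Reason></soapenv:Fault>"
            "</soapenv:Body>"
        )
    },
    "event2": {"name": "alex"},               # no XML content, left unchanged
}

# Models <field from='message_body' name='faultReason' xpath='...'>
#          <namespace prefix='soapenv' uri='http://www.w3.org/2003/05/soap-envelope'/>
extract(rows, "message_body", "faultReason", ".//soapenv:Text", {"soapenv": SOAP_NS})
print(rows["event1"]["faultReason"])
```

The namespaces dictionary plays the role of the namespace elements: it maps each prefix used in the XPath expression to its URI.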

Drop Analyzer

Drops groups, rows or columns that fulfil the given criteria from the data flow.

Syntax :

  <drop type="group|row|column">
       [fieldSet] || [groupSet]
  </drop>

  [fieldSet] := <fieldSet (0..1)matchUsing='and|or'>
                   +<field name="" (0..1)regex=""/>
               </fieldSet>

  [groupSet] := <groupSet>
                   +<group regex=""/>
               </groupSet>

Syntax Explanation:

    <drop type="group|row|column">              : What type of data to drop is defined by 'type'.

    <fieldSet (0..1)matchUsing='and|or'>        : The attribute 'matchUsing' defines the semantics
                                                  used to match. e.g.: 'and' will drop a row only if all
                                                  the criteria defined by the one or more fields in the
                                                  fieldSet are satisfied. Default is 'and'. This attribute
                                                  is not applicable for 'column' type.
        <field name="" (0..1)regex=""/>         : Defines the criteria to drop rows or columns. By default,
    </fieldSet>                                   specifying multiple fields requires all of them to match
                                                  ('and' semantics).
                                                     name : Drop the row if this column exists, or drop the
                                                            column from the row.
                                                     regex : Additionally match the column value against the
                                                             regex. Drop only if the regex matches.

    <groupSet>
      <group regex=''/>                         : Defines the criteria to drop groups. Multiple groups to be
    </groupSet>                                   dropped can be specified.
                                                  regex : Match group name with regex. Drop group if group
                                                          name matches regex.

For the 'group' type, a groupSet specification should be used; for the 'row' or 'column' types, a fieldSet specification should be used.

Example :

  1. Example 1: Dropping rows.
    • Input : List of rows as given below.
    •        {employee1} : {{name : ben}, {age : 42}, {post : eng} }
             {employee2} : {{name : alex}, {age : 35}, {post : admin}, {id : EB233} }
             {employee3} : {{name : bob}, {age : 48}, {post : eng} }
             {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
    • Drop Specification :
    •       <drop type='row'>
               <fieldSet matchUsing='or'>
                  <field name='id'/>
                  <field name='post' regex='eng'/>
               </fieldSet>
            </drop>
      
      
    • Output : List of filtered rows as below, with rows matching the filtering criteria dropped
    •       {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
  2. Example 2: Dropping columns.
    • Input : List of rows as given below.
    •        {employee1} : {{name : ben}, {age : 42}, {post : eng} }
             {employee2} : {{name : alex}, {age : 35}, {post : admin}, {id : EB233} }
             {employee3} : {{name : bob}, {age : 48}, {post : eng} }
             {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
    • Drop Specification :
    •       <drop type='column'>
               <fieldSet>
                  <field name='id'/>
                  <field name='post' regex='eng*'/>
               </fieldSet>
            </drop>
      
    • Output : List of rows with filtered columns.
    •        {employee1} : {{name : ben}, {age : 42} }
             {employee2} : {{name : alex}, {age : 35}, {post : admin} }
             {employee3} : {{name : bob}, {age : 48} }
             {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
  3. Example 3: Dropping groups
    • Input : List of grouped rows.
    •        [30-plus]      : {employee1} : {{name : ben}, {age : 42}, {post : eng} }
                              {employee2} : {{name : alex}, {age : 35}, {post : admin} }
                              {employee3} : {{name : bob}, {age : 48}, {post : eng} }
             [less-than-30] : {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
      
      
      
    • Drop Specification :
    •       <drop type='group'>
               <groupSet>
                  <group regex='30-plus'/>
               </groupSet>
            </drop>
      
      
    • Output : List of filtered groups
    •      [less-than-30] : {employee4} : {{name : sarah}, {age : 26}, {post : admin} }
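
The three drop types can be modelled in Python as below. This is a sketch under assumptions, not the analyzer's implementation: the function names are hypothetical, each field is written as a (name, regex) pair with None meaning "no regex", and the regex is applied with re.search, which is a guess because this page does not say whether a full or partial match is required.

```python
import re
from typing import Dict, List, Optional, Tuple

Row = Dict[str, str]
Rows = Dict[str, Row]
Field = Tuple[str, Optional[str]]            # (column name, optional regex)


def field_matches(row: Row, name: str, regex: Optional[str]) -> bool:
    """True if the column exists and, when a regex is given, its value matches."""
    if name not in row:
        return False
    return regex is None or re.search(regex, row[name]) is not None


def drop_rows(rows: Rows, fields: List[Field], match_using: str = "and") -> Rows:
    """Drop rows for which all ('and') or any ('or') of the fields match."""
    combine = all if match_using == "and" else any
    return {key: row for key, row in rows.items()
            if not combine(field_matches(row, n, rx) for n, rx in fields)}


def drop_columns(rows: Rows, fields: List[Field]) -> Rows:
    """Drop every column named by a field whose criteria match in that row."""
    return {key: {col: val for col, val in row.items()
                  if not any(col == n and field_matches(row, n, rx) for n, rx in fields)}
            for key, row in rows.items()}


def drop_groups(groups: Dict[str, Rows], regexes: List[str]) -> Dict[str, Rows]:
    """Drop groups whose name matches any of the given regexes."""
    return {name: rows for name, rows in groups.items()
            if not any(re.search(rx, name) for rx in regexes)}


employees: Rows = {
    "employee1": {"name": "ben", "age": "42", "post": "eng"},
    "employee2": {"name": "alex", "age": "35", "post": "admin", "id": "EB233"},
    "employee3": {"name": "bob", "age": "48", "post": "eng"},
    "employee4": {"name": "sarah", "age": "26", "post": "admin"},
}

remaining_rows = drop_rows(employees, [("id", None), ("post", "eng")], "or")      # Example 1
trimmed = drop_columns(employees, [("id", None), ("post", "eng*")])                # Example 2
remaining_groups = drop_groups(
    {"30-plus": {}, "less-than-30": {"employee4": employees["employee4"]}},
    ["30-plus"],
)                                                                                   # Example 3
print(list(remaining_rows), list(remaining_groups))
```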
      
      

Fault Detection Analyzer

This analyzer detects SOAP faults present in a specific message payload. It is specific to the WSO2 ESB.

Syntax :

    <detectFault>
        <errorFields="" currentSequenceIdentifier=""/>
    </detectFault>

Syntax Explanation:

    <detectFault>
        <errorFields=""                  : errorFields = The error fields that should be passed
                                              down to an alert analyzer
        currentSequenceIdentifier=""/>   : currentSequenceIdentifier = The field that identifies
                                              the current ESB sequence

    </detectFault>

Example :

    • Input : List of rows as given below.
    •     {event1} : {{bam_error_code : 101000}, {bam_error_message : Sender IO error sending},
                       {message_body :
                       "<soapenv:Body xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
                        <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope"><soapenv:Code>
                        <soapenv:Value>axis2ns1:Sender<soapenv:Value><soapenv:Code><soapenv:Reason>
                        <soapenv:Text xml:lang="en-US">Invalid value \"?\" for element in<soapenv:Text>
                        , {bam_current_sequence : in} }
          {event2} : {{bam_error_code : 101501}, {bam_error_message : Sender IO error receiving},
                       {message_body :
                       "<soapenv:Body xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
                        <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope"><soapenv:Code>
                        <soapenv:Value>axis2ns1:Sender<soapenv:Value><soapenv:Code><soapenv:Reason>
                        <soapenv:Text xml:lang="en-US">Invalid value \"?\" for element in<soapenv:Text>
                        , {bam_current_sequence : out } }
           {event3} : {{name : alex}, {age : 35}, {post : admin} }
           {event4} : {{name : bob}, {age : 48} }
      
      
    • Detect Fault Specification :
    •     <detectFault>
              <errorFields="bam_error_code, bam_error_message" currentSequenceIdentifier="bam_current_sequence"/>
          </detectFault>
      
      
    • Output : List of rows that have errors. Events that do not have errors are dropped.
    •      {event1} : {{bam_message_path : in -> out }, {bam_error_code : 101000}, {bam_error_message : Sender IO error sending},
                       {message_body :
                       "<soapenv:Body xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
                        <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope"><soapenv:Code>
                        <soapenv:Value>axis2ns1:Sender<soapenv:Value><soapenv:Code><soapenv:Reason>
                        <soapenv:Text xml:lang="en-US">Invalid value \"?\" for element in<soapenv:Text>
                        , {bam_current_sequence : in} }
          {event2} : {{bam_message_path : in -> out }, {bam_error_code : 101501}, {bam_error_message : Sender IO error receiving},
                       {message_body :
                       "<soapenv:Body xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
                        <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope"><soapenv:Code>
                        <soapenv:Value>axis2ns1:Sender<soapenv:Value><soapenv:Code><soapenv:Reason>
                        <soapenv:Text xml:lang="en-US">Invalid value \"?\" for element in<soapenv:Text>
                        , {bam_current_sequence : out } }
      
      

Alert Trigger

The alert trigger can be used to send emails from a desired mail account. It takes the records obtained from the previous analyzer, includes them in the mail in a tabular format, and sends the email.

Syntax :

    <alert to="TO_MAIL_ADDRESS" transport="smtps"
        subject="Fault notification" from="no-reply@wso2.org"
        mailhost="smtp.gmail.com" username="synapse.demo.0@gmail.com"
        password="mailpassword">
        <fields>bam_activity_id,operation_name,service_name,timestamp,faultCode,faultReason,bam_message_path</fields>
    </alert>

Syntax Explanation:

    <alert to="" transport=""                                    : to = address to send the mail to
                                                                  : transport = transport to use, e.g. smtps
        subject="" from=""                                        : subject = subject of the mail
                                                                  : from = mail address to set as the from address
        mailhost="" username=""                                   : mailhost = the mail host
                                                                  : username = the username of the mail account
        password="">                                              : password = password of the mail account
        <fields></fields>                                         : comma-separated names of the fields to include in the mail
    </alert>

Example :

    • Input : List of rows as given below.
    •      {event1} : {{bam_message_path : in -> out }, {bam_error_code : 101000}, {bam_error_message : Sender IO error sending},
                       {message_body :
                       "<soapenv:Body xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
                        <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope"><soapenv:Code>
                        <soapenv:Value>axis2ns1:Sender<soapenv:Value><soapenv:Code><soapenv:Reason>
                        <soapenv:Text xml:lang="en-US">Invalid value \"?\" for element in<soapenv:Text>
                        , {bam_current_sequence : in} }
          {event2} : {{bam_message_path : in -> out }, {bam_error_code : 101501}, {bam_error_message : Sender IO error receiving},
                       {message_body :
                       "<soapenv:Body xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
                        <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope"><soapenv:Code>
                        <soapenv:Value>axis2ns1:Sender<soapenv:Value><soapenv:Code><soapenv:Reason>
                        <soapenv:Text xml:lang="en-US">Invalid value \"?\" for element in<soapenv:Text>
                        , {bam_current_sequence : out } }
      
    • Alert Specification :
    •     <alert to="TO_MAIL_ADDRESS" transport="smtps"
              subject="Fault notification" from="no-reply@wso2.org"
              mailhost="smtp.gmail.com" username="synapse.demo.0@gmail.com"
              password="mailpassword">
              <fields>bam_activity_id,operation_name,service_name,timestamp,faultCode,faultReason,bam_message_path</fields>
          </alert>
      
      
    • Output : A mail similar to the screenshot below is sent:
    • [Screenshot: sample email]
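
How the selected fields might end up as a table in the mail body can be sketched in Python with the standard library's email package. This is not how BAM builds its mail: the function build_alert, the plain-text table layout and the recipient address ops@example.com are assumptions made for the illustration, and the message is only constructed here, not sent.

```python
from email.message import EmailMessage
from typing import Dict, List

Row = Dict[str, str]


def build_alert(rows: Dict[str, Row], fields: List[str], to: str,
                sender: str, subject: str) -> EmailMessage:
    """Build a mail whose body lists the chosen fields of each row as a table."""
    header = " | ".join(fields)
    lines = [header, "-" * len(header)]
    for row in rows.values():
        # Fields missing from a row are shown as empty cells.
        lines.append(" | ".join(row.get(name, "") for name in fields))
    message = EmailMessage()
    message["To"] = to
    message["From"] = sender
    message["Subject"] = subject
    message.set_content("\n".join(lines))
    return message


faults = {
    "event1": {"bam_message_path": "in -> out", "bam_error_code": "101000",
               "bam_error_message": "Sender IO error sending"},
    "event2": {"bam_message_path": "in -> out", "bam_error_code": "101501",
               "bam_error_message": "Sender IO error receiving"},
}

alert = build_alert(
    faults,
    fields=["bam_error_code", "bam_error_message", "bam_message_path"],
    to="ops@example.com",                     # hypothetical recipient
    sender="no-reply@wso2.org",
    subject="Fault notification",
)
print(alert.get_content())
```

To actually deliver such a message over a transport like smtps, Python's smtplib.SMTP_SSL could be used with values corresponding to the mailhost, username and password attributes (login followed by send_message).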