Monday, October 4, 2010

Mondrian OLAP Engine

Mondrian is an OLAP engine written in Java. It executes queries written in the MDX language, reads data from a relational database (RDBMS), and presents the results in a multidimensional format via a Java API.


Architecture



Reference : "http://mondrian.pentaho.com/documentation/architecture.php"

Deploy and run the web application with a non-embedded database

1. Install Tomcat (version 5.0.25 or later).
2. From the unzipped binary release, explode lib/mondrian.war to TOMCAT_HOME/webapps/mondrian.
3. Create or open the mondrian.properties file in TOMCAT_HOME/webapps/mondrian and set the mondrian.jdbcDrivers property for the database you are using.
4. Open the web.xml file in TOMCAT_HOME/webapps/mondrian/WEB-INF and change the two connect strings there to point at the database that holds the FoodMart data set. That is,

Provider=mondrian;Jdbc=jdbc:odbc:MondrianFoodMart;Catalog=/WEB-INF/queries/FoodMart.xml;JdbcDrivers=sun.jdbc.odbc.JdbcOdbcDriver;

becomes

Provider=mondrian;Jdbc=jdbc:mysql://localhost/foodmart?user=foodmart&password=foodmart;Catalog=/WEB-INF/queries/FoodMart.xml;JdbcDrivers=com.mysql.jdbc.Driver;
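
The change between the two connect strings above can be sketched as a small script. This is only an illustration, not part of Mondrian: the helper names parse_connect_string and build_connect_string are hypothetical, and the parser assumes a plain Key=Value; layout with no quoted values or semicolons inside a value. The last step reflects a general XML rule: an ampersand inside web.xml has to be written as &amp;.

```python
from xml.sax.saxutils import escape


def parse_connect_string(text):
    """Split 'Key=Value;Key=Value;' into a dict, keeping the original order."""
    props = {}
    for part in text.split(";"):
        if not part.strip():
            continue  # skip the empty piece after the trailing ';'
        key, _, value = part.partition("=")  # split on the first '=' only
        props[key.strip()] = value.strip()
    return props


def build_connect_string(props):
    """Join the properties back into 'Key=Value;' form."""
    return "".join(f"{key}={value};" for key, value in props.items())


original = (
    "Provider=mondrian;Jdbc=jdbc:odbc:MondrianFoodMart;"
    "Catalog=/WEB-INF/queries/FoodMart.xml;"
    "JdbcDrivers=sun.jdbc.odbc.JdbcOdbcDriver;"
)

props = parse_connect_string(original)
# Only the JDBC URL and the driver class change; Provider and Catalog stay.
props["Jdbc"] = "jdbc:mysql://localhost/foodmart?user=foodmart&password=foodmart"
props["JdbcDrivers"] = "com.mysql.jdbc.Driver"

updated = build_connect_string(props)
# '&' is reserved in XML, so escape it before pasting the string into web.xml.
xml_safe = escape(updated)

print(updated)
print(xml_safe)
```

The first printed line is the connect string as Mondrian should see it; the second is the form that is safe to paste into an XML file.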

Database compatibility

Mondrian is known to run on the following databases.

  1. Apache Derby (formerly known as Cloudscape)
  2. Firebird
  3. Greenplum
  4. HP Neoview
  5. Hypersonic (also known as hsqldb)
  6. IBM DB2
  7. Infobright
  8. Informix
  9. Ingres
  10. Interbase
  11. LucidDB
  12. Microsoft Access
  13. Microsoft SQL Server
  14. MySQL
  15. Netezza
  16. Oracle
  17. PostgreSQL (also known as Postgres)
  18. Sybase
  19. Teradata

Thursday, April 1, 2010

SpagoBI

SpagoBI is a free Business Intelligence platform, developed entirely according to the free software philosophy. It is a suite of coordinated and integrated tools for developing a specific BI solution in any business area or market segment. SpagoBI has a modular structure built around a core system, which keeps the platform solid and coherent while leaving it highly extensible.
Not every project needs all the SpagoBI modules: you can use only the modules appropriate for your project. You can start with a single module, knowing that further extensions will be easy because every module fits into an overall design.
SpagoBI builds on many technologies and products already available as Free Open Source Software. The first is Spago, the J2EE framework already released by Engineering Ingegneria Informatica S.p.A. and Sinapsi. SpagoBI therefore inherits Spago's features and technical characteristics and uses them in its specific context:

* it manages specific Business Intelligence objects such as reports, OLAP analyses, dashboard and scorecard views, and Data Mining models;
* it is focused on data management and on information content and context;
* it supports BI system administrators in the control, validation, certification and distribution of Business Intelligence objects.

According to its development road map, the main SpagoBI features will be the following:

* BI Portlets: every BI object will be distributed to the end user through portlet technology (JSR 168). In this way the portlets and BI objects are managed and encapsulated in the portal already chosen for the specific enterprise solution (even if the portal is built with commercial products).
* Reporting, OLAP Analysis, Data Mining, geographical analysis, ETL processes, Dashboards and Scorecards are the BI objects managed by SpagoBI, each with its own execution engine and development environment. SpagoBI manages the production and validation cycles, parametric activation, navigation, and the versioning and storage of results in a similar way for all of them, although every BI object keeps its distinctive characteristics.
  o Reports provide structured views of information; they are widely distributed in static formats (.pdf, .xls, .csv, .html, etc.). SpagoBI supports navigation between different reports, passing the parameters along.
  o The multidimensional structures for OLAP analysis add a higher degree of freedom and variability. The analysis axes and the observed measures are structured, which makes it possible to examine data at various levels of detail and from various perspectives by means of drill-down, drill-across, and slice-and-dice operations.
  o In the performance management context, SpagoBI provides many widgets for building dashboards and for the parametric evaluation of performance scores.
  o Data Mining algorithms and processes (Neural Networks, Decision Trees, etc.) enable data analysis aimed at uncovering hidden information. SpagoBI supports the implementation of Data Mining models and the analysis of their results through the other Business Intelligence objects.
  o Geographical analysis represents business data on geospatial maps.
  o In the context of data extraction, transformation and loading (ETL), SpagoBI supports the execution of ETL processes; their results can be analyzed through the other Business Intelligence tools of the SpagoBI platform.
* Query-by-example: it offers a visual mode for querying data. The query structure can be saved as a template for subsequent report development, or the results can be exported for external processing (e.g. CSV, XML).
* Interaction with source systems: it provides connectors, protocols and services for bidirectional data exchange with source systems.
* Meta Repository: thanks to metadata, SpagoBI is a truly integrated platform rather than a loose set of products. The meta-repository contains all information about data (technical and business metadata), processes and rules for managing the platform.
* User profiles: access rights to functionality can be differentiated according to the user's role.
* Document management: a versioned repository for all the relevant results produced by the BI objects, including scheduled ones. Search and retrieval functions for these documents are provided.
* Scheduling: off-line activation is provided for all data transport and import/export processes, document production, storage and destruction, etc.
* Workflow: SpagoBI manages the approval and certification flow for BI objects and for the documents produced from them.
* Administration: support for the management functions of the whole platform.
* Logging/Auditing: control services for monitoring the platform's functionality and performance.



Ref:http://www.google.lk/search?q=spagoBI+features&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

Reference :-http://spagobi.eng.it/SpagoBISiteENG/target/docs/features.html
Date- 2010-04-02

Jasper reports

The Jaspersoft Business Intelligence (BI) Suite delivers a low-cost alternative for reporting, dashboarding, analysis, and data integration, so organizations can make better decisions without breaking their budget.
Enterprise-Class BI For Any Size Company
Many organizations need business intelligence, but not all need the same capabilities, feature sets, and level of performance. That's why Jaspersoft offers several product versions and editions to help match the right solution to the right problem.


Tuesday, March 2, 2010

Open source tool - RapidMiner

Report the Future.

RapidMiner describes itself as the world-leading open-source system for data mining. It is available as a stand-alone application for data analysis and as a data mining engine for integration into your own products. According to the vendor, thousands of applications of RapidMiner in more than 40 countries give their users a competitive edge.



* Data Integration, Analytical ETL, Data Analysis, and Reporting in one single suite
* Powerful but intuitive graphical user interface for the design of analysis processes
* Repositories for process, data and meta data handling
* Only solution with meta data transformation: forget trial and error and inspect results already during design time
* Only solution which supports on-the-fly error recognition and quick fixes


Enterprise Edition: Comparison

Feature | Community Edition | Enterprise Edition (Small) | Enterprise Edition (Standard) | Enterprise Edition (Developer)

General
Number of Users | Unlimited | Unlimited | Unlimited | Unlimited
License | Open Source | Open Source or Closed Source | Open Source or Closed Source | Open Source or Closed Source
Certified | No | Yes | Yes | Yes

Integration
Into Open-Source Software | Yes | Yes | Yes | Yes
Into Closed-Source Software | No | No | No | Yes
Into Web Services | No | No | No | Yes

Guarantees
Guarantee for Bugfixes | No | Yes | Yes | Yes
Intellectual Property Indemnification | No | Yes | Yes | Yes
Warranty for Services | No | Yes | Yes | Yes

Problem Resolution Support
Community Forums | Yes | Yes | Yes | Yes
Community Web Documentation (Wiki) | Yes | Yes | Yes | Yes
Service Level Agreement | No | No | Yes | Yes
Number of Incidents | No | No | Unlimited | Unlimited
Web-based Case Management | No | No | Yes | Yes
Mail Support | No | No | Yes | Yes
Support Access | No | No | Business Hours | Business Hours
Maximum Initial Response Time | No | No | 4 hours | 4 hours
Emergency Hot Fix Build | No | No | Yes | Yes

Consultative Support
Remote Troubleshooting | No | No | Yes | Yes
Process Review | No | No | Yes | Yes
Process Optimization | No | No | Yes | Yes
Performance Tuning | No | No | Yes | Yes
Customer Code Review | No | No | No | Yes

Maintenance
Software Maintenance | By In-house Staff | By Rapid-I Engineers | By Rapid-I Engineers | By Rapid-I Engineers
Updates via Update and Installation Server | Yes | Yes | Yes | Yes
Software Installation for Extensions | Yes | Yes | Yes | Yes
Patch Releases | No | Yes | Yes | Yes
Fixes Included in Future Releases | No | Yes | Yes | Yes
Stabilized and Certified Software Releases | No | Yes | Yes | Yes
Managed Release Cycles | No | Yes | Yes | Yes

Monday, March 1, 2010

Open source BI tools

Business intelligence tools are a type of application software designed to report, analyze and present data. The tools generally read data that have been previously stored, often, though not necessarily, in a data warehouse or data mart.

Types of business intelligence tools
The key general categories of business intelligence tools are:

Spreadsheets
Reporting and querying software: tools that extract, sort, summarize, and present selected data
OLAP
Digital Dashboards
Data mining
Process mining
Business performance management
Local information systems
Except for spreadsheets, these tools are sold as standalone tools, suites of tools, components of ERP systems, or as components of software targeted to a specific industry. The tools are sometimes packaged into data warehouse appliances.

Open source free products
Eclipse BIRT Project: Eclipse-based open source reporting for web applications, especially those based on Java EE.
JasperSoft offers a version of JasperReports called JasperReports Community which has comprehensive basic reporting capabilities. A commercial edition is also available with more features.
RapidMiner community edition (formerly YALE) is an open-source software for data analysis, knowledge discovery, data mining, predictive analytics, and machine learning. A commercial edition is also available with more features.
SpagoBI: uses Free Open Source Software tools to provide a unified Free Platform for the development of Business Intelligence solutions at enterprise level.


Open source commercial products
Palo (OLAP database): OLAP Server, Worksheet Server and ETL Server
Pentaho: Reporting, analysis, dashboard, data mining and workflow capabilities
Proprietary free products
Freereporting.com: Free browser-based reporting software by LogiXML, available for .NET or Java. More advanced BI features such as dashboards, analysis grid and interactive data viewer are not offered in Freereporting.com but are available in LogiXML's full-featured managed reporting software, Logi Info.
InetSoft offers Visualize Free as a Web-only based visualization application where you upload your data to their server.
MicroStrategy offers a non-restricted free version of their MicroStrategy Reporting Suite.
Proprietary products
ACE*COMM
Ab Initio
ActiveReports
Actuate
COA Solutions
ComArch
CyberQuery
Data Applied
Dimensional Insight
HP Neoview
IBM
Applix
Cognos
InetSoft
Informatica
Information Builders
InfoZoom
Izenda
Jreport
LogiXML
LucidEra
Microsoft
SQL Server Analysis Services
PerformancePoint Server 2007
Proclarity
MicroStrategy
m-Power
Oracle Corporation
Hyperion Solutions Corporation
Oracle Business Intelligence Suite Enterprise Edition
Panorama Software
Pebble Reports
Pentaho
Pervasive DataRush
Pilot Software, Inc.
PRELYTIS
Qliktech
SAP Business Information Warehouse
Business Objects
OutlookSoft
SAS Institute
Siebel Systems
Spotfire (now Tibco)
StatSoft
SPSS
Sybase IQ
Tableau Software
Teradata
Thomson Data Analyzer


Reference :http://en.wikipedia.org/wiki/Business_intelligence_tools#Open_source_free_products

Date : 01-March-2010

Wednesday, February 24, 2010

Business Intelligence (BI) tools

Business Intelligence (BI) tools are widely used for reporting, dashboarding and analysis. The following BI tools were examined against 70 criteria considered important for high productivity and for Business Intelligence systems that actually add value to your organization. The tools are listed in random order.

No. | Business Intelligence Tool | Version | Vendor
1. | Oracle Enterprise BI Server | 7.8 | Oracle
2. | SAP Business Objects Enterprise | XI r2 | SAP
3. | SAP NetWeaver BI | 7.0 | SAP
4. | SAS Enterprise BI Server | 9.2 | SAS Institute
5. | SSAS, SSRS & Excel Services* | 2008 | Microsoft
6. | IBM Cognos Series 8 | 8.3 | IBM
7. | Board Management Intelligence Toolkit | 7.1 | Board International
8. | BizzScore Suite | 7.2 | EFM Software
9. | WebFocus | 7 | Information Builders
10. | QlikView | 9 | QlikTech
11. | Microstrategy | 9 | Microstrategy
12. | Oracle Hyperion System | 9 | Oracle
13. | Actuate | 9.1 | Actuate

Categories and criteria examined

Below are listed all the criteria we researched. Some company and market characteristics come first, followed by the more technical criteria: architecture, functionality, usability, search and alerting, security, connectivity, costs and so forth. All criteria are explained in more detail in the full BI Tool Survey Report, which is available for purchase.

Common Business Intelligence Tool information

Product name(s)

The name of the product(s). If multiple products are needed to provide the functionality defined above, list their names here.

Version number(s)

The version numbers of the product(s).

Number of customers WW

How many customers are using the product(s), worldwide (rough estimate)?

Three largest implementations WW

What are the three largest implementations in terms of the number of users?

Number of resellers/partners

How many local partners / resellers do you have an agreement with? Please provide figures for worldwide, Europe and the Benelux.

* Architecture
* BI Tool Functionality
* Ease-of-Use
* Search and alerting
* Security and Connectivity
* Costs

Architecture


Load-balancing and clustering

Does the product support load balancing and clustering? If so, describe how it is arranged.

Fail-over

Does the product support fail-over? If so, describe how it is arranged.

Zero-footprint viewer

Are the products fully web-based, and do they support the zero-footprint concept (i.e. no software is downloaded or needed on the client)?

Zero-footprint definer

If so, does it hold true for the software where one defines reports, dashboards and analysis?

Web 2.0 / AJAX supported

Is Web 2.0 / Ajax supported?

Number of portlets technologies

What kind of portlets does the product support? For example, IBM WebSphere or BEA WebLogic Portal?

Portlets zero-footprint

Do the portlets have a zero footprint?

Portlets certified

Are the portlets certified or validated by the vendor of the portal technology?

Platforms

On what platforms does the product run? Please specify this for the design environment, the end-user environment (the clients) and the server environment.

Reporting, dashboarding and analysis all-in-one

Does your solution provide the functionality defined above within one environment, software package or screen? In that case reporting, interactive analysis and dashboarding can be combined very easily.

Functions in a Service Oriented Architecture

Does the product function well in a Service Oriented Architecture (SOA)? If so, describe at least two examples here.

Supports 64-bit architecture

Does the product support a 64-bit architecture?

Supports In-Memory or Caching

Does the product use in-memory technologies or caching? If so, describe here the restrictions and conditions for using it properly.

Designs stored once in a repository

Is the design and content of a table or graph (the query and the visualization) stored once, and is it reusable for multiple purposes, for example when users want to see it from different perspectives such as month, year, customer, product et cetera?

Designs are reusable across BI applications

Is it stored in an (open) repository, and is it possible to use it in multiple BI applications?

BI Tool Functionality

Role-based dashboarding and reporting

Does the product support the concept of role-based reporting, dashboarding and analysis? For example, when a user logs on, does he see only the information (portlets, graphs, tables, gauges, data) he is authorized for?

Common drill-down paths stored in repository

Are common drill-down paths and hierarchies stored in a repository?

Do they depend on the role users have?

Can they be made dependent on the role users have? For example, the first drill-down path of a product manager is into the product groups; the first drill-down path of the sales director is region.

Standard reports can be adjusted

Is it possible for end users to adjust standard reports or dashboards, for example delete a column, filter the data, change the sorting et cetera?

Can the adjustments be saved for that particular user

Can the adjustments be saved for that particular user, so that they are still in place the next time he logs on?

Export to Excel, formatting included

Is an export of the data and/or formatting to Excel available?

Export to PDF

Is an export of the data and/or formatting to PDF available?

Attach notes to figures and distribute them

Can one attach notes to figures, graphs and dashboards, and distribute these notes to a particular user group or a specific user?

Notes can be linked to specific dimension member

If so, can these notes be linked to a specific period, product or customer (i.e. dimension members)?

Full syntax of SQL SELECT supported

Is the full syntax of the SELECT statement supported (for example inner joins, outer joins, unions, subqueries, GROUP BY) to fill a dataset used by a portlet, report, dashboard gauge or graph?

Supported by menus

If so, can one select all elements of the SELECT statement by using menus instead of typing?

Can this be done by an end-user

If so, can this be done by an end user?

Basket analysis supported

Is basket analysis supported?

Write-back facilities

Does the product support write-back facilities to change the data? This is mainly for planning and budgeting purposes.

Keeps history when figures are changed

If so, does the product support storing the history of these planning and budgeting figures in the database when the figures are changed by the user?

Support for slowly changing dimensions

Does the product explicitly support slowly changing dimensions (type 1 and 2), and in the case of type 2 does it automatically offer the user two options (current or historical view), if required and relevant?

Support for balanced scorecarding

Does the product support the balanced scorecard methodology in terms of key performance indicators, perspectives, critical success factors, strategy maps and actions?

BI self-service supported

Does the product support the concept of BI self-service, for example can one make or change one's own dashboards, reports or layout?

Scheduled distribution of reports/dashboards

Can reports / dashboards be distributed by e-mail on a regular, scheduled basis?

Publish and subscribe

Does the product support the principle of publish and subscribe?

Is the product real-time aware

Is the product real-time aware? For example, if new data has arrived in the database, is the report or dashboard refreshed instantly?

Usability

Ease-of-use

The ease of use of the product. Is it easy to learn and easy to use on a daily basis?

Screen design

Does the screen look quiet and well-balanced?

Task compatibility

Does the tool support tasks in the same sequence as the BI developer performs them?

Number of graphs and visualizations

What types of graphs/visualizations can be chosen? Mention them all.

Conditional formatting

Is conditional formatting supported?

Style-sheets supported (css)

Does the product make use of style sheets (CSS)?

Drill-down supported on elements in graphs

Is drill-down supported on specific elements of graphs if required and relevant? For example, a user wants to drill down on the combination of product type and quarter.

Support for PDAs and BlackBerries

Are the reports / dashboards optimized for viewing on PDAs and BlackBerries?

Search and alerting

Meta data search

Is there a search facility to search reports / dashboards (search over meta data)?

Search over data in reports and dashboards

Is there a search facility that supports finding specific data stored in the reports or dashboards (search over data)?

Alerts and notifications

Can one define alerts and notifications?

Are they RSS compatible?

If so, are they RSS compatible?

Security and connectivity

Single sign-on supported

Does the product support single sign-on? If yes, please describe how it works.

Support for Active or Enterprise Directories

Does the product automatically make use of Active Directories or Enterprise Directories?

Data authorization available

Is it possible to define for each user, user group or role what data one is authorized to see?

On rows/columns/both

If so, is it applicable to rows, columns, or both rows and columns?

Are reports and dashboards automatically adjusted

If so, is the report or dashboard automatically adjusted for that particular user? If yes, describe here how it works.

Does it make use of database authorizations

If authorizations are stored in the repository of the database, can the product make use of them?

Number of native connections to data sources

List here the data sources the product can natively connect to, e.g. Oracle, DB2, SQL Server 2005, MS Analysis Services et cetera.

Does the product have a repository

Is there a repository? If so, describe its functions here.

Impact-analysis

Does the product support impact analysis?

Data lineage

Does the product support data lineage?

SDK and customization of the product

Is there a software development kit (SDK), so one can build one's own portal with a specific look-and-feel and functions?

SDK has same functions as clients

If so, does the SDK contain/support the same possibilities and functions as if one were working with the client interface?

Costs

User population: 10% pure analysts, 20% report viewers and 70% a combination of dash boarding, reporting and analysis

Pricing 100 users

Please provide a rough estimate of the list price for 100 users.

Pricing 2 processors

Please provide a rough estimate of the list price for running it on two processors.

Pricing 2000 users

Please provide a rough estimate of the list price for 2,000 users.

Pricing 8 processors

Please provide a rough estimate of the list price for running it on eight processors.

List price per database connection

If the price depends on connections to specific data sources, please provide a rough estimate of the list price per connection.

Additional costs

If there are any additional costs, please provide them here.

Price end-user training

What are the costs for attending the end-user training (price per user)?

Price designer training

What are the costs for attending the designer training (price per user)?

Price advanced designer training

If applicable, what are the costs for attending the advanced designer training (price per user)?

Percentage for support and maintenance

What is the percentage one has to pay for maintenance and support?



REFERENCE : http://www.businessintelligencetoolbox.com/
DATE : 25-02-2010

Tuesday, February 23, 2010

Data warehousing tools

This post covers the selection of business intelligence tools and the selection of the data warehousing team. Tools covered are:

Database, Hardware:
  Oracle
  MS SQL Server
  IBM DB2
  Teradata
  Sybase
  MySQL

ETL (Extraction, Transformation, and Loading):
  IBM WebSphere Information Integration (Ascential DataStage)
  Ab Initio
  Informatica
  Talend

OLAP:
  Business Objects
  Cognos
  Hyperion
  Microsoft Analysis Services
  MicroStrategy
  Pentaho
  Palo OLAP Server

Reporting:
  Business Objects (Crystal Reports)
  Cognos
  Actuate

Metadata

Data warehousing concepts

The dimensional data model is the one most often used in data warehousing systems. It differs from third normal form (3NF), which is commonly used for transactional (OLTP) systems. As a result, the same data is stored differently in a dimensional model than in a 3NF model.
To understand dimensional data modeling, let's define some of the terms commonly used in this type of modeling:
Dimension: A category of information. For example, the time dimension.
Attribute: A unique level within a dimension. For example, Month is an attribute in the Time dimension.
Hierarchy: The specification of levels that represents the relationships between different attributes within a dimension. For example, one possible hierarchy in the Time dimension is Year → Quarter → Month → Day.
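
To make the three terms concrete, here is a minimal sketch of a Time dimension in plain Python. The names TIME_HIERARCHY, time_attributes and roll_up, and the sales figures, are invented for illustration. The point is only that a hierarchy lets the same facts be summed at any of its levels.

```python
from collections import defaultdict
from datetime import date

# Levels of one possible Time hierarchy, from coarsest to finest.
TIME_HIERARCHY = ["year", "quarter", "month", "day"]


def time_attributes(d):
    """Derive the attribute value at every level of the hierarchy for one date."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "day": d.day,
    }


def roll_up(facts, level):
    """Sum a measure at the given level; the key also holds all coarser levels."""
    depth = TIME_HIERARCHY.index(level) + 1
    totals = defaultdict(float)
    for d, amount in facts:
        attrs = time_attributes(d)
        key = tuple(attrs[name] for name in TIME_HIERARCHY[:depth])
        totals[key] += amount
    return dict(totals)


# Hypothetical sales amounts keyed by date (illustrative data only).
sales = [
    (date(2010, 1, 15), 100.0),
    (date(2010, 2, 3), 50.0),
    (date(2010, 4, 20), 75.0),
    (date(2009, 12, 31), 25.0),
]

by_year = roll_up(sales, "year")        # keys look like (2010,)
by_quarter = roll_up(sales, "quarter")  # keys look like (2010, 1)
print(by_year)
print(by_quarter)
```

Moving from by_year to by_quarter is a drill-down along the hierarchy; going the other way is a roll-up.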


In the star schema design, a single object (the fact table) sits in the middle and is radially connected to other surrounding objects (dimension lookup tables) like a star. Each dimension is represented as a single table. The primary key in each dimension table is related to a foreign key in the fact table.

Sample star schema

All measures in the fact table are related to all the dimensions that fact table is related to. In other words, they all have the same level of granularity.
A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.
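
A simple star of this kind can be sketched with Python's built-in sqlite3 module. The table and column names (fact_sales, dim_date, dim_product) and the rows are made up for illustration and are not taken from the referenced site. One fact table holds the measure and a foreign key to each dimension table, and a query joins outward from the fact table to group by dimension attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year INTEGER, quarter INTEGER, month INTEGER
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT, category TEXT
);
-- The fact table sits in the middle: one foreign key per dimension
-- plus the measure, all at the same level of granularity.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
""")

conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?)",
    [(1, 2010, 1, 1), (2, 2010, 1, 2), (3, 2010, 2, 4)],
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(10, "Milk", "Dairy"), (11, "Cheese", "Dairy"), (12, "Bread", "Bakery")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [(1, 10, 20.0), (1, 12, 5.0), (2, 11, 30.0), (3, 10, 10.0), (3, 12, 15.0)],
)

# Join the fact table to both dimensions and aggregate the measure.
rows = conn.execute("""
SELECT d.quarter, p.category, SUM(f.sales_amount)
FROM fact_sales AS f
JOIN dim_date AS d ON d.date_key = f.date_key
JOIN dim_product AS p ON p.product_key = f.product_key
GROUP BY d.quarter, p.category
ORDER BY d.quarter, p.category
""").fetchall()

for quarter, category, total in rows:
    print(quarter, category, total)
```

A complex star would add a second fact table that shares some of the same dimension tables.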





Conceptual, Logical, And Physical Data Models

The three levels of data modeling (conceptual, logical, and physical) were discussed in prior sections. Here we compare these three types of data models.
Below we show the conceptual, logical, and physical versions of a single data model.

Conceptual Model Design

Logical Model Design

Physical Model Design




Reference website - http://www.1keydata.com/datawarehousing/

Date - 23-Feb-2010

Data Warehousing