Report data volume
The Visioneo reportlet is designed to fit into an existing architecture without requiring any additional resources. Your Birt reports must therefore be developed so that they do not strain portal servers. Building small, agile, highly parameterizable reports is the most important way to achieve that. While the topic would easily fill a whole book and fuel many debates, this article addresses one very common design mistake.
When your reports have to deal with large data sources, there are several ways to manage filters and aggregations. We must distinguish:
- Data source volume
- Data volume returned by your SQL / ODA query
- Data volume returned by your dataset
- Data volume returned by your report elements (tables, cross-tables, charts)
Note there need not be any correlation between the data source volume and the report volume: for example, a report can use a data source with 1 billion rows and aggregate the results into a single output cell. Where report filters and aggregations are applied changes everything in your project's architecture and performance!
Consider the report sample below, a cross-table with about 500 output cells. It uses the default Birt "Classic Models" database, but for this article let's imagine it is based on a 20 million row table, aggregated and filtered to produce this result.
The main filtering steps are:
- Filter on report parameters: for example product category, offices, and years Y & Y-1. Let's say that once these filters are applied, 1 million rows remain to process
- Filter the final output on year Y. Year Y-1 is included in the filter above to allow time-series calculations, but is not displayed
And the main aggregations are:
- Aggregate to the report output level (month, group, product category). In our example (7 categories * 12 months * 2 years) this results in about 200 rows.
- Time-series calculations (year-to-date Y & Y-1, 12-month rolling, ...)
- Totals (by month, by category)
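For the cross-table example, these filters and aggregations can be expressed directly in the dataset query. Below is a minimal, runnable sketch using Python and sqlite3; the `sales` table and its columns are illustrative assumptions, not the actual Classic Models schema:

```python
import sqlite3

# Hypothetical fact table standing in for the article's 20-million-row
# example; names (sales, category, office, ...) are assumptions.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE sales (
    category TEXT, office TEXT, year INT, month INT, amount REAL)""")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [("Ships", "Paris", 2015, m, 100.0) for m in range(1, 13)]
    + [("Ships", "Paris", 2014, m, 90.0) for m in range(1, 13)]
    + [("Trains", "Paris", 2015, m, 50.0) for m in range(1, 13)],
)

# Both the filters (WHERE, fed by Birt report parameters) and the
# aggregation down to the report output level (GROUP BY) are pushed to
# the database: only the pre-aggregated rows reach the report engine.
rows = con.execute(
    """SELECT category, year, month, SUM(amount) AS total
       FROM sales
       WHERE category IN (?, ?)          -- report parameter: categories
         AND office = ?                  -- report parameter: office
         AND year IN (?, ?)              -- years Y and Y-1
       GROUP BY category, year, month
       ORDER BY category, year, month""",
    ("Ships", "Trains", "Paris", 2015, 2014),
).fetchall()
print(len(rows))  # 36 aggregated rows instead of every detail row
```

On the real 20-million-row table, the same query shape would hand Birt roughly 200 rows rather than 1 million.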
The following scenarios illustrate where these operations can be processed in a Birt report design.
As you can see, your processor cores are knocked out in scenario 1, out of breath in scenario 2, work pretty well in scenario 3 and enjoy scenario 4.
Therefore, you should always process large volumes in your report dataset SQL statements:
- Filter in a SQL "where" clause using Birt report parameters rather than in dataset / cross-table filters
- Aggregate in a SQL "group by" clause rather than in data cube aggregations / table groups.
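These two rules can be contrasted on a toy fact table (Python + sqlite3; all names are illustrative assumptions). The first query ships every detail row to the report layer, as in scenarios 1 and 2; the second lets the database reduce the volume first, as in scenarios 3 and 4:

```python
import sqlite3
from collections import defaultdict

# Toy stand-in for a large fact table; names are assumptions.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE facts (category TEXT, month INT, amount REAL)")
con.executemany("INSERT INTO facts VALUES (?, ?, ?)",
                [("Cat%d" % (i % 7), i % 12 + 1, 1.0) for i in range(10000)])

# Scenarios 1/2: every detail row crosses the wire, the report aggregates.
detail = con.execute("SELECT category, month, amount FROM facts").fetchall()
in_report = defaultdict(float)
for cat, month, amount in detail:
    in_report[(cat, month)] += amount

# Scenarios 3/4: the database aggregates; only output-level rows transfer.
grouped = con.execute(
    "SELECT category, month, SUM(amount) FROM facts GROUP BY category, month"
).fetchall()

print(len(detail), "rows transferred vs", len(grouped))  # 10000 vs 84
assert dict(in_report) == {(c, m): t for c, m, t in grouped}  # same result
```

The output is identical either way; only the volume moved between the database and the report engine changes, and that volume is what exhausts your portal server.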
Of course, there are huge differences between DBMSs. For example:
- Oracle can generate transparent aggregates on fact tables
- Infobright and Sybase IQ are columnar databases and will evaluate large queries in sub-second time
- Teradata is the master of intra-/inter-query parallelization
But even with a simple DBMS such as MySQL, by applying filters and aggregations in SQL queries you will always be able to optimize Birt reports easily, for example by building an aggregate table and indexes. Furthermore, you should always consider using stored procedures for large/complex queries, as described in the article "Stored procedures vs SQL queries".
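The aggregate-table optimization mentioned above can be sketched as follows (sqlite3 stands in for MySQL here; table and column names are assumptions, and in production the aggregate would be rebuilt by a batch job):

```python
import sqlite3

# Illustrative detail table; in practice this would be the large fact table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE facts (category TEXT, year INT, month INT, amount REAL)")
con.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)",
                [("Cat%d" % (i % 7), 2014 + i % 2, i % 12 + 1, 2.0)
                 for i in range(5000)])

# One-off step (e.g. a nightly batch): pre-aggregate to the report
# output level...
con.execute("""CREATE TABLE facts_agg AS
               SELECT category, year, month, SUM(amount) AS total
               FROM facts GROUP BY category, year, month""")
# ...and index it on the columns the report parameters filter on.
con.execute("CREATE INDEX idx_agg ON facts_agg (category, year)")

# Report queries now scan a small, indexed aggregate instead of millions
# of detail rows.
rows = con.execute(
    "SELECT month, total FROM facts_agg WHERE category = ? AND year = ?",
    ("Cat0", 2015)).fetchall()
print(len(rows))
```

Because the report's SQL already filters and groups, switching it from the detail table to the aggregate table requires no change to the report design itself, only to the dataset query.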
With this simple principle you won't struggle with performance issues, and you will be able to focus on Birt's added-value features such as time-series calculations, derived measures, report parameters, formatting, etc. You don't need millions of rows to be concerned by this article! Even if your reports handle only a few thousand rows, you will always benefit from reducing the volume upstream.
Limit per query
The Visioneo reportlet has a parameter to protect your server against overly large report datasets, in
<reportlet webapp root>/WEB-INF/portlet.xml:
This parameter prevents a dataset from receiving more than 20000 rows per query. If this limit is exceeded, no exception is thrown: the report is processed normally with the first 20000 rows received. That means that with our fictive example above, scenarios 1 & 2 would process only 0.1% of the data, while scenarios 3 & 4 would work as expected.
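The effect of this silent truncation can be simulated (Python + sqlite3, illustrative names; a SQL LIMIT stands in for the reportlet's per-query cap):

```python
import sqlite3

MAX_ROWS = 20000  # the reportlet's per-query cap (silently truncates)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE facts (category TEXT, amount REAL)")
con.executemany("INSERT INTO facts VALUES (?, ?)",
                [("Cat%d" % (i % 7), 1.0) for i in range(100000)])

# Scenarios 1/2: the dataset asks for every detail row; only the first
# 20000 arrive, and the report aggregates a silently truncated sample.
truncated = con.execute(
    "SELECT amount FROM facts LIMIT ?", (MAX_ROWS,)).fetchall()
bad_total = sum(a for (a,) in truncated)

# Scenarios 3/4: the database aggregates; the handful of result rows
# fit far below the cap and the figures stay correct.
(good_total,) = con.execute("SELECT SUM(amount) FROM facts").fetchone()

print(bad_total, "vs", good_total)  # 20000.0 vs 100000.0
```

The dangerous part is that the truncated report still renders and looks plausible; only its numbers are wrong.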
You can increase this value, but if you have read the article carefully, this may not be something you want to do! Add a required parameter or aggregations to your dataset SQL instead! Remember, there is no direct correlation between your data source volume and the volume you really need to extract.
In practice, well-designed reports that absolutely have to fetch more than 20000 rows are very rare, whatever your company's size. Most of the time, this is required when Birt is used as a data extractor to produce a CSV / Excel export for loading into some local database.