Mondrian and OLAP
???Mondrian 是使用java編寫的OLAP引擎,它執(zhí)行MDX語言描述的查詢,可以從關(guān)系數(shù)據(jù)庫中獲取數(shù)據(jù)。
Online Analytical Processing (OLAP)
???OLAP是指實(shí)時地分析大數(shù)據(jù)量,與Online Transaction Processing (OLTP)不同。
Year
|
2000
|
2001
|
Growth
|
Product
|
Dollar sales
|
Unit sales
|
Dollar sales
|
Unit sales
|
Dollar sales
|
Unit sales
|
Total
|
$7,073
|
2,693
|
$7,636
|
3,008
|
8%
|
12%
|
Books
|
$2,753
|
824
|
$3,331
|
966
|
21%
|
17%
|
Fiction
|
$1,341
|
424
|
$1,202
|
380
|
-10%
|
-10%
|
Non-fiction
|
$1,412
|
400
|
$2,129
|
586
|
51%
|
47%
|
Magazines
|
$2,753
|
824
|
$2,426
|
766
|
-12%
|
-7%
|
— Greetings cards
|
$1,567
|
1,045
|
$1,879
|
1,276
|
20%
|
22%
|
從上表可以看到y(tǒng)ear和Product為dimensions (維度),measures 為'Unit sales' and 'Dollar sales'。
Layers of a Mondrian system
???Mondrian OLAP System 包含4個層:
???1、the presentation layer ?????????用于描述最終用戶所看到的,其表現(xiàn)形式有很多,如:pivot表、pie、line和bar圖、maps和動態(tài)圖。由用戶提問,OLAP服務(wù)器返回答案。 ???2、the dimensional layer ?????????這一層解析、驗證和執(zhí)行MDX?查詢,首先計算軸,再計算軸上所有單元的值。metadata用于描述空間模型,及空間模型怎樣映射到關(guān)系模型上。 ???3、the star layer ?????????它負(fù)責(zé)維護(hù)一個聚合的cache,aggregation 聚合是一些內(nèi)存中的測量值(cells),及一些維度值。如果請求的cells不在cache中,則aggregation manager?向the storage layer發(fā)出請求。 ???4、and the storage layer ?????????the storage layer為RDBMS。

What is MDX?
???'multi-dimensional expressions'為Mondrian執(zhí)行的查詢。 ???下面為一個基本的查詢: SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS, {[Product].members} ON ROWS FROM [Sales] WHERE [Time].[1997].[Q2]
What is a schema??
?schema定義了一個多維數(shù)據(jù)庫。它包含一個邏輯模型,由多個cube、hierarchies、members及一個到物理模型的映射組成。 ?邏輯模型包含:cubes, dimensions, hierarchies, levels, and members.
A cube is a collection of dimensions and measures in a particular subject area.
A measure is a quantity that you are interested in measuring, for example, unit sales of a product, or cost price of inventory items.
A dimension is an attribute, or set of attributes, by which you can divide measures into sub-categories. For example, you might wish to break down product sales by their color, the gender of the customer, and the store in which the product was sold; color, gender, and store are all dimensions. 例如: <Schema> ? <Cube name="Sales"> ??? <Table name="sales_fact_1997"/> ??? <Dimension name="Gender" foreignKey="customer_id"> ????? <Hierarchy hasAll="true" allMemberName="All Genders" primaryKey="customer_id"> ??????? <Table name="customer"/> ??????? <Level name="Gender" column="gender" uniqueMembers="true"/> ????? </Hierarchy> ??? </Dimension> ??? <Dimension name="Time" foreignKey="time_id"> ????? <Hierarchy hasAll="false" primaryKey="time_id"> ??????? <Table name="time_by_day"/> ??????? <Level name="Year" column="the_year" type="Numeric" uniqueMembers="true"/> ??????? <Level name="Quarter" column="quarter" uniqueMembers="false"/> ??????? <Level name="Month" column="month_of_year" type="Numeric" uniqueMembers="false"/> ????? </Hierarchy> ??? </Dimension> ??? <Measure name="Unit Sales" column="unit_sales" aggregator="sum" formatString="#,###"/> ??? <Measure name="Store Sales" column="store_sales" aggregator="sum" formatString="#,###.##"/> ??? <CalculatedMember name="Profit" dimension="Measures" formula="[Measures].[Store Sales]-[Measures].[Store Cost]"> ????? <CalculatedMemberProperty name="FORMAT_STRING" value="$#,##0.00"/> ??? </CalculatedMember> ? </Cube> </Schema>
? MDX查詢?yōu)椋?br />
SELECT {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS, ? {[Time].[1997].[Q1].descendants} ON ROWS FROM [Sales]
WHERE [Gender].[F]
[Time] | [Measures].[Unit Sales] | [Measures].[Store Sales] | [1997].[Q1] | 0 | 0 | [1997].[Q1].[Jan] | 0 | 0 | [1997].[Q1].[Feb] | 0 | 0 | [1997].[Q1].[Mar] | 0 | 0 |
A member is a point within a dimension determined by a particular set of attribute values. The gender hierarchy has the two members 'M' and 'F'. 'San Francisco', 'California' and 'USA' are all members of the store hierarchy. A hierarchy is a set of members organized into a structure for convenient analysis. For example, the store hierarchy consists of the store name, city, state, and nation. The hierarchy allows you form intermediate sub-totals: the sub-total for a state is the sum of the sub-totals of all of the cities in that state, each of which is the sum of the sub-totals of the stores in that city. A level is a collection of members which have the same distance from the root of the hierarchy. A dimension is a collection of hierarchies which discriminate on the same fact table attribute (say, the day that a sale occurred).
mondrian.properties ???mondrian有一個配置文件可以定義它是如何運(yùn)行的。
a simple star schema
???
Cube [Sales] has two measures [Unit sales] and [Dollar sales]
Dimension [Product] has levels [All Products] , [Manufacturer] , [Brand] , [Prodid]
Dimension [Time] has levels [All Time] , [Year] , [Quarter] , [Month] , [Day]
Dimension [Customer] has levels [All Customers] , [State] , [City] , [Custid]
Dimension [Payment Method] has levels [All Payment Methods] , [Payment Method]
|